nucleic - nucleic search, using sw model

Run on:         August 24, 2004, 12:03:10 ; Search time 4311 Seconds
                (without alignments)
                10688.319 Million cell updates/sec

Title:          US-09-891-138A-1
Perfect score:  1543
Sequence:       1 gctcctggcagagttttctg tgcctaaataaatcaatata 1543

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       27513289 seqs, 14931090276 residues

Total number of hits satisfying chosen parameters: 55026578

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      EST:*
                 1: em_estba:*
                 2: em_esthum:*
                 3: em_estin:*
                 4: em_estmu:*
                 5: em_estov:*
                 6: em_estpl:*
                 7: em_estro:*
                 8: em_htc:*
                 9: gb_est1:*
                10: gb_est2:*
                11: gb_htc:*
                12: gb_est3:*
                13: gb_est4:*
                14: gb_est5:*
                15: em_estfun:*
                16: em_estom:*
                17: em_gss_hum:*
                18: em_gss_inv:*
                19: em_gss_pln:*
                20: em_gss_vrt:*
                21: em_gss_fun:*
                22: em_gss_mam:*
                23: em_gss_mus:*
                24: em_gss_pro:*
                25: em_gss_rod:*
                26: em_gss_phg:*
                27: em_gss_vrl:*
                28: gb_gss1:*
                29: gb_gss2:*
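
The throughput figure in the header can be cross-checked against the query length, the database size and the search time. The following is a minimal sketch, assuming that one "cell update" is one query residue paired with one database residue and that both strands were searched. The factor of two is an assumption, although it is consistent with the reported hit count being twice the number of sequences. The function name is illustrative only.

```python
# Figures copied from the search header.
QUERY_LENGTH = 1543            # "Perfect score" / query length
DB_SEQUENCES = 27_513_289      # "Searched: ... seqs"
DB_RESIDUES = 14_931_090_276   # "Searched: ... residues"
SEARCH_SECONDS = 4311          # "Search time"
STRANDS = 2                    # assumption: forward and reverse strands searched


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=STRANDS):
    """Dynamic-programming cells filled per second, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
total_hits = DB_SEQUENCES * STRANDS

print(f"{rate:.3f} Million cell updates/sec")  # header reports 10688.319
print(total_hits)                              # header reports 55026578
```

Under these assumptions both derived values agree with the figures printed in the header.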



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
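
For orientation, the "sw model" named in the header refers to Smith-Waterman local alignment, and Gapop and Gapext are its gap-open and gap-extend penalties. The sketch below is an illustrative score-only implementation with affine gaps, not the search engine's own code. The match and mismatch values are assumptions (the IDENTITY_NUC table itself is not shown in the report), and a gap of length L is assumed to cost Gapop + L * Gapext.

```python
def sw_score(a, b, match=1.0, mismatch=-0.6, gap_open=10.0, gap_ext=1.0):
    """Best local alignment score of a and b (Smith-Waterman, affine gaps).

    match and mismatch are assumed values; gap_open and gap_ext default to
    the Gapop and Gapext settings printed in the header.
    """
    neg = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols   # best score ending at (i-1, j)
    f_prev = [neg] * cols   # best score ending at (i-1, j) with a gap in b
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg] * cols
        e = neg             # best score ending at (i, j) with a gap in a
        for j in range(1, cols):
            # Extend an existing gap or open a new one from the left cell.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            # Extend an existing gap or open a new one from the cell above.
            f_cur[j] = max(f_prev[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            diag = h_prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(sw_score("ACGT", "ACGT"))
```

A production search would also need traceback to recover the alignment itself; this sketch returns only the score.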
139 9 AI021184

AK080866 Mus muscu
BB323771 BB323771
BX527630 BX527630
AI663305 uk27c10.y
BB744515 BB744515
BB746222 BB746222
BB738743 BB738743
BB847918 BB847918
BB864882 BB864882
BB778587 BB778587
BB739482 BB739482
AI649254 uk27c10.x
BB645274 BB645274
BB846608 BB846608
BY368584 BY368584
CD246161 AGENCOURT
CE610929 tigr-gss-
BB220946 BB220946
BG402029 602466748
BB254869 BB254869
BB220888 BB220888
BB225749 BB225749
BX281458 BX281458
BB500452 BB500452
BU141159 603137524
BB327439 BB327439
AG083174 Pan trogl
BB498575 BB498575
BB221521 BB221521
BU291924 603606116
BY005778 BY005778
BB215653 BB215653
BB498898 BB498898
AL309576 Tetraodon
BB213317 BB213317
AL186565 Tetraodon
BU373390 603811071
AW112068 MC15648 m
AW612141 hg94h07.x
BU352057 603527490
AL317059 Tetraodon
BF196066 hr81f02.x
AL310077 Tetraodon
BE221739 hr58c09.x
AI021184 ub02f12.r



ALIGNMENTS 



RESULT 1 
AK080866
LOCUS       AK080866     1585 bp    mRNA    linear   HTC 20-SEP-2003
DEFINITION  Mus musculus 4 days neonate male adipose cDNA, RIKEN full-length
            enriched library, clone:B430012O21 product:G-PROTEIN COUPLED
            RECEPTOR GPR91, full insert sequence.
ACCESSION   AK080866
VERSION     AK080866.1  GI:26099527
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   The RIKEN Genome Exploration Research Group Phase II Team and the
            FANTOM Consortium.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409, 685-690 (2001)
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6  (bases 1 to 1585)
  AUTHORS   Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
            Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
            Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
            Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
            Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
            Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
            Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
            Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
            Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..1585
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:B430012O21"
                     /db_xref="MGI:2411980"
                     /db_xref="taxon:10090"
                     /clone="B430012O21"
                     /sex="male"
                     /tissue_type="adipose"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="4 days neonate"
     misc_feature    69..1025
                     /note="G-PROTEIN COUPLED RECEPTOR GPR91 (SPTR|Q99MT6,
                     evidence: FASTY, 94.3%ID, 100%length, match=954)
                     putative"
     polyA_signal    1558..1563
                     /note="putative"
     polyA_site      1585
                     /note="putative"
ORIGIN



Query Match 96.2%; Score 1484.8; DB 11; Length 1585;

Best Local Similarity 98.4%; Pred. No. 0;

Matches 1521; Conservative 0; Mismatches 22; Indels 3; Gaps 2;
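
The two percentages reported with each result appear to follow directly from the other figures: Query Match as the score divided by the perfect score, and Best Local Similarity as matches divided by the number of alignment columns. These formulas are inferred from the reported numbers rather than stated in the report, and the function names are illustrative. A small sketch that recomputes them for the first four results:

```python
PERFECT_SCORE = 1543  # "Perfect score" from the search header

# (accession, score, matches, mismatches, indels), copied from the
# per-result statistics lines in this report.
RESULTS = [
    ("AK080866", 1484.8, 1521, 22, 3),
    ("BB323771", 560.0, 596, 5, 3),
    ("BX527630", 516.4, 517, 1, 0),
    ("AI663305", 495.8, 500, 7, 0),
]


def query_match(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect


def best_local_similarity(matches, mismatches, indels):
    """Matching columns as a percentage of all alignment columns."""
    return 100.0 * matches / (matches + mismatches + indels)


for accession, score, matches, mismatches, indels in RESULTS:
    print(
        accession,
        f"{query_match(score):.1f}%",
        f"{best_local_similarity(matches, mismatches, indels):.1f}%",
    )
```

Rounded to one decimal place, the recomputed values agree with the percentages printed for these four results.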



Qy    1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   26 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 85

Qy   61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||
Db   86 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGGATTTTA 145

Qy  121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
        ||||||||||||||||||||||||||||||||||||||||| |||||| || ||||| ||
Db  146 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGGGGTGTTTGGGTACCTGTT 205

Qy  181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  206 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 265

Qy  241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
        |||||||||| ||||||||||||||||||||||||||| |||||||||||||||||||||
Db  266 TGCTTTCCTGGGCACCCTTCCCATCCTGATAAAGAGTTTTGCCAATGATAAGGGGACCTA 325

Qy  301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
        |||||||||||| || |||||||||||||||| |||| |||||||||| || ||||||||
Db  326 TGGAGATGTTCTTTGGATAAGCAACCGATATGGGCTTAACACCAACCTTTAAACCAGCAT 385

Qy  361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAA--GTACCCTTTCCGA 418
        | | ||| ||| |||||||||||||||||||||||||||||||||     || |||||||
Db  386 CTTTTTCTTCATTTTCATTAGCATGGACCGATATCTGCTCATGAAAGTACCCTTTTCCGA 445

Qy  419 GAACAC-TTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTT 477
        |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  446 GAACACTTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTT 505

Qy  478 AGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGG 537
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  506 AGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGG 565

Qy  538 CAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCT 597
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  566 CAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCT 625

Qy  598 CTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAA 657
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  626 CTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAA 685

Qy  658 GATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAA 717
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  686 GATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAA 745

Qy  718 ACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCA 777
        |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Db  746 ACCCCAACGCCTGGTGGTCCTGGCAGTTGTGATCTTCTCTATACTCTTCACACCCTATCA 805

Qy  778 TATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACA 837
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  806 TATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACA 865

Qy  838 GAAGGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCAT 897
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  866 GAAGGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCAT 925

Qy  898 CAATCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTT 957
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  926 CAATCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTT 985

Qy  958 CAGACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTC 1017
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  986 CAGACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTC 1045

Qy 1018 ACTCAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAA 1077
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1046 ACTCAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAA 1105

Qy 1078 ACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAG 1137
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1106 ACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAG 1165

Qy 1138 CTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTA 1197
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1166 CTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTA 1225

Qy 1198 TGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGA 1257
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1226 TGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGA 1285

Qy 1258 ACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCT 1317
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1286 ACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCT 1345

Qy 1318 TTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCAT 1377
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1346 TTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCAT 1405

Qy 1378 CATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTA 1437
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1406 CATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTA 1465

Qy 1438 AAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAAT 1497
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1466 AAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAAT 1525

Qy 1498 TATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543
        ||||||||||||||||||||||||||||||||||||||||||||||
Db 1526 TATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1571



RESULT 2 
BB323771
LOCUS       BB323771      683 bp    mRNA    linear   EST 31-AUG-2001
DEFINITION  BB323771 RIKEN full-length enriched, 4 days neonate male adipose
            Mus musculus cDNA clone B430012O21 3', mRNA sequence
ACCESSION   BB323771
VERSION     BB323771.2  GI:15411432
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 683)
  AUTHORS   Arakawa,T., Carninci,P., Fukuda,S., Furuno,M., Hanagaki,T.,
            Hara,A., Hiramoto,K., Hori,F., Ishii,Y., Ito,M., Kawai,J.,
            Konno,H., Kouda,M., Koya,S., Matsuyama,T., Miyazaki,A., Nomura,K.,
            Ohno,M., Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F.,
            Takeda,Y., Tanaka,T., Toya,T., Muramatsu,M. and Hayashizaki,Y.
  TITLE     RIKEN Mouse ESTs (Arakawa,T., et al. 2001)
  JOURNAL   Unpublished (2001)
COMMENT     On Jul 11, 2000 this sequence version replaced gi:9032085.
            Contact: Yoshihide Hayashizaki
            Laboratory for Genome Exploration Research Group, RIKEN Genomic
            Sciences Center (GSC), Yokohama Institute
            The Institute of Physical and Chemical Research (RIKEN)
            1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
            Tel: 81-45-503-9222
            Fax: 81-45-503-9216
            Email: genome-res@gsc.riken.go.jp,
            URL: http://genome.gsc.riken.go.jp/
            Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
            Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new
            genes. Genome Res. 10 (10), 1617-1630 (2000)
            wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
            Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
            Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
            and Hayashizaki,Y.
            RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer. Genome Res.
            10 (11), 1757-1771 (2000)
            Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
            Sugahara,Y. and Hayashizaki,Y.
            Computer-based methods for the mouse full-length cDNA
            encyclopedia: real-time sequence clustering for construction of a
            nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
            Yamanaka,I., Kiyosawa,H., Kondo,S., Saito,T., Shinagawa,A.,
            Aizawa,K., Fukuda,S., Hara,A., Itoh,M., Kawai,J., Shibata,K.,
            Arakawa,T., Ishii,Y. and Hayashizaki,Y.
            Mapping of 19032 mouse cDNAs on mouse chromosomes. J. Struct.
            Func. Genomics 2 pre, L72-L86 (2001)
            Please visit our web site (http://genome.gsc.riken.go.jp/) for
            further details.
            cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
FEATURES             Location/Qualifiers
     source          1..683
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="B430012O21"
                     /sex="male"
                     /tissue_type="adipose"
                     /dev_stage="4 days neonate"
                     /lab_host="DH10B"
                     /clone_lib="RIKEN full-length enriched, 4 days neonate
                     male adipose"
                     /note="Site_1: SalI; Site_2: BamHI; cDNA library was
                     prepared and sequenced in Mouse Genome Encyclopedia
                     Project of Genome Exploration Research Group in Riken
                     Genomic Sciences Center and Genome Science Laboratory in
                     RIKEN. Division of Experimental Animal Research in Riken
                     contributed to prepare mouse tissues. 1st strand cDNA was
                     primed with a primer [5'
                     GAGAGAGAGAAGGATCCAAGAGCTCTTTTTTTTTTTTTTTTVN 3'], cDNA was
                     prepared by using trehalose thermo-activated reverse
                     transcriptase and subsequently enriched for full-length by
                     cap-trapper. cDNA went through one round of normalization
                     to Rot = 10.0 and subtraction to Rot = 229.0. Second
                     strand cDNA was prepared with the primer adapter of
                     sequence [5' GAGAGAGAGATTCTCGAGTTAATTAAATTAATCCCCCCCCCCCCC
                     3']. cDNA was cleaved with XhoI and BamHI. Vector: a
                     modified pBluescript KS(+) after bulk excision from Lambda
                     FLC I."
ORIGIN



Query Match 36.3%; Score 560; DB 10; Length 683; 

Best Local Similarity 98.7%; Pred. No. 4.2e-124; 

Matches 596; Conservative 0; Mismatches 5; Indels 3; Gaps 3; 



Qy  943 GCTGATTAGTAAGTTCAGAC-AATACTTCAAG-TCCCTTACATCCTTC-AGGACATGAGC 999
        |||||||||||||||||| |  ||||||||||| ||||||||||||||| ||||||| ||
Db   66 GCTGATTAGTAAGTTCAGCCAAATACTTCAAGTTCCCTTACATCCTTCAAGGACATAAGT 125

Qy 1000 TGCTGGATGCAGGTCTTCACTCAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGT 1059
        |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db  126 TGCTGGATGCAGGTTTTCACTCAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGT 185

Qy 1060 TGAGTTTTAACTAAGTAAACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCC 1119
        |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db  186 TGAGTTTTAATTAAGTAAACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCC 245

Qy 1120 CCAGGGCTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTT 1179
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  246 CCAGGGCTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTT 305

Qy 1180 TAGGTTATACCCAGAGTATGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTA 1239
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  306 TAGGTTATACCCAGAGTATGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTA 365

Qy 1240 AGAACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTT 1299
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  366 AGAACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTT 425

Qy 1300 GGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAA 1359
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  426 GGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAA 485

Qy 1360 ATGCTGAATGCATTTCATCATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATT 1419
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  486 ATGCTGAATGCATTTCATCATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATT 545

Qy 1420 TTTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTA 1479
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  546 TTTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTA 605

Qy 1480 GATTTTCTATTTGAAAATTATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAA 1539
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  606 GATTTTCTATTTGAAAATTATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAA 665

Qy 1540 TATA 1543
        ||||
Db  666 TATA 669



RESULT 3 
BX527630
LOCUS       BX527630      556 bp    mRNA    linear   EST 27-JUN-2003
DEFINITION  BX527630 Sugano mouse kidney mkia Mus musculus cDNA clone
            IMAGp998B194840 ; IMAGE:1970226, mRNA sequence.
ACCESSION   BX527630
VERSION     BX527630.1  GI:32297360
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 556)
  AUTHORS   Heil,O., Ebert,L., Neubert,P., Peters,M., Radelof,U., Schneider,D.
            and Korn,B.
  TITLE     Mouse UnigeneSet - RZPD2
  JOURNAL   Unpublished (2003)
COMMENT     Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
            RZPD; IMAGp998B194840.
            RZPDLIB; I.M.A.G.E. cDNA Clone Collection;
            Mouse UnigeneSet - RZPD2 (RZPDLIB No. 981)
            http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=981
            Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Heubnerweg 6, D-14059 Berlin, Germany
            Tel: +49 30 32639 101
            Fax: +49 30 32639 111
            www.rzpd.de
            This clone is available royalty-free from RZPD;
            contact RZPD (clone@rzpd.de) for further information. Seq primer:
            sugF, Primer sequence: CTTCTGCTCTAAAAGCTGCG.
FEATURES             Location/Qualifiers
     source          1..556
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL"
                     /db_xref="taxon:10090"
                     /clone="IMAGp998B194840 ; IMAGE:1970226"
                     /sex="female"
                     /dev_stage="adult"
                     /lab_host="DH10B"
                     /clone_lib="Sugano mouse kidney mkia"
                     /note="Organ: kidney; Vector: pME18S-FL3; Site_1: DraIII
                     (CACTGTGTG); Site_2: DraIII (CACCATGTG); 1st strand cDNA
                     was primed with an oligo(dT) primer
                     [ATGTGGCCTTTTTTTTTTTTTTTTT]; double-stranded cDNA was
                     ligated to a DraIII adaptor [TGTTGGCCTACTGG], digested
                     and cloned into distinct DraIII sites of the pME18S-FL3
                     vector (5' site CACTGTGTG, 3' site CACCATGTG). XhoI should
                     be used to isolate the cDNA insert. Size selection was
                     performed to exclude fragments <1.5kb. Library
                     constructed by Dr. Sumio Sugano (University of Tokyo
                     Institute of Medical Science). Custom primers for
                     sequencing: 5' end primer CTTCTGCTCTAAAAGCTGCG and 3' end
                     primer CGACCTGCAGCTCGAGCACA."
ORIGIN



Query Match 33.5%; Score 516.4; DB 13; Length 556; 

Best Local Similarity 99.8%; Pred. No. 1.3e-113; 

Matches 517; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
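
The reported scores can be reproduced from the match, mismatch, indel and gap counts. The sketch below assumes a score of +1 per match and a penalty of 0.6 per mismatch. Both values are inferred from the numbers in this report and are not stated in it. It also assumes that each gap costs Gapop once plus Gapext per indel position. The function name is illustrative.

```python
GAP_OPEN = 10.0    # "Gapop" from the header
GAP_EXTEND = 1.0   # "Gapext" from the header
MATCH = 1.0        # assumption: identity scoring awards 1 per match
MISMATCH = 0.6     # assumption: penalty inferred from the reported scores


def expected_score(matches, mismatches, indels, gaps):
    """Alignment score rebuilt from the summary counts."""
    return (
        matches * MATCH
        - mismatches * MISMATCH
        - gaps * GAP_OPEN
        - indels * GAP_EXTEND
    )


# (accession, reported score, matches, mismatches, indels, gaps)
REPORTED = [
    ("AK080866", 1484.8, 1521, 22, 3, 2),
    ("BB323771", 560.0, 596, 5, 3, 3),
    ("BX527630", 516.4, 517, 1, 0, 0),
    ("AI663305", 495.8, 500, 7, 0, 0),
]

for accession, reported, m, x, indels, gaps in REPORTED:
    print(accession, reported, round(expected_score(m, x, indels, gaps), 1))
```

Under these assumptions the rebuilt score equals the reported score for each of the four results listed.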



Qy    1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   39 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 98

Qy   61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   99 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 158

Qy  121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
        |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db  159 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTTGGCTACCTCTT 218

Qy  181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  219 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 278

Qy  241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  279 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 338

Qy  301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  339 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 398

Qy  361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  399 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 458

Qy  421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  459 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 518

Qy  481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        ||||||||||||||||||||||||||||||||||||||
Db  519 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 556





RESULT 4 
AI663305
LOCUS       AI663305      520 bp    mRNA    linear   EST 10-MAY-1999
DEFINITION  uk27c10.y1 Sugano mouse kidney mkia Mus musculus cDNA clone
            IMAGE:1970226 5' similar to SW:P2YR_RAT P49651 P2Y PURINOCEPTOR 1
            ;, mRNA sequence.
ACCESSION   AI663305
VERSION     AI663305.1  GI:4766888
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 520)
  AUTHORS   Marra,M., Hillier,L., Kucaba,T., Martin,J., Beck,C., Wylie,T.,
            Underwood,K., Steptoe,M., Theising,B., Allen,M., Bowers,Y.,
            Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., Schurk,R.,
            Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., McCann,R.,
            Waterston,R. and Wilson,R.
  TITLE     The WashU-NCI Mouse EST Project 1999
  JOURNAL   Unpublished (1999)
COMMENT     Other_ESTs: uk27c10.x1
            Contact: Marra M/WashU-NCI Mouse EST Project 1999
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: mouseest@watson.wustl.edu
            This clone is available royalty-free through LLNL ; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            MGI:986966
            Seq primer: custom primer used
            High quality sequence stop: 490.
FEATURES             Location/Qualifiers
     source          1..520
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:1970226"
                     /sex="female"
                     /dev_stage="adult"
                     /lab_host="DH10B"
                     /clone_lib="Sugano mouse kidney mkia"
                     /note="Organ: kidney; Vector: pME18S-FL3; Site_1: DraIII
                     (CACTGTGTG); Site_2: DraIII (CACCATGTG); 1st strand cDNA
                     was primed with an oligo(dT) primer
                     [ATGTGGCCTTTTTTTTTTTTTTTTT]; double-stranded cDNA was
                     ligated to a DraIII adaptor [TGTTGGCCTACTGG], digested
                     and cloned into distinct DraIII sites of the pME18S-FL3
                     vector (5' site CACTGTGTG, 3' site CACCATGTG). XhoI should
                     be used to isolate the cDNA insert. Size selection was
                     performed to exclude fragments <1.5kb. Library
                     constructed by Dr. Sumio Sugano (University of Tokyo
                     Institute of Medical Science). Custom primers for
                     sequencing: 5' end primer CTTCTGCTCTAAAAGCTGCG and 3' end
                     primer CGACCTGCAGCTCGAGCACA."
ORIGIN



Query Match 32.1%; Score 495.8; DB 9; Length 520; 

Best Local Similarity 98.6%; Pred. No. 1.2e-108; 

Matches 500; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 
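
Each result's statistics are written as semicolon-terminated "name value" fields, which makes them straightforward to read programmatically. The following is a minimal sketch of such a reader. The function name `parse_stats` is hypothetical, and the field layout is assumed to match the statistics lines shown in this report.

```python
def parse_stats(block: str) -> dict:
    """Parse 'Name value;' fields from a result's statistics lines.

    The last whitespace-separated token of each field is taken as the
    value; a trailing '%' is dropped and the value is converted to float.
    """
    stats = {}
    for field in block.replace("\n", " ").split(";"):
        field = field.strip()
        if not field:
            continue
        key, _, value = field.rpartition(" ")
        stats[key.strip()] = float(value.rstrip("%"))
    return stats


# Statistics lines for RESULT 4, copied from this report.
RESULT_4 = """
Query Match 32.1%; Score 495.8; DB 9; Length 520;
Best Local Similarity 98.6%; Pred. No. 1.2e-108;
Matches 500; Conservative 0; Mismatches 7; Indels 0; Gaps 0;
"""

stats = parse_stats(RESULT_4)
print(stats["Score"], stats["Pred. No."], stats["Mismatches"])
```

Splitting on the last space keeps multi-word names such as "Best Local Similarity" and "Pred. No." intact without needing a separate pattern for each field.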

Qy    1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
        |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
Db   14 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCTGAATGGCACAGAATTTATC 73

Qy   61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   74 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 133

Qy  121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
        |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db  134 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTTGGCTACCTCTT 193

Qy  181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  194 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 253

Qy  241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  254 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 313

Qy  301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  314 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 373

Qy  361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420
         ||||| |||||| |||||| ||||||||||||||||||||||||||||||| |||||||
Db  374 GCTCTTGCTCACTGTCATTATCATGGACCGATATCTGCTCATGAAGTACCCTGTCCGAGA 433

Qy  421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  434 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 493

Qy  481 GACCTTAGAAGTTCTACCCATGCTCAC 507
        |||||||||||||||||||||||||||
Db  494 GACCTTAGAAGTTCTACCCATGCTCAC 520



RESULT 5 
BB744515 
LOCUS BB744515 469 bp mRNA linear EST 16-OCT-2001
DEFINITION BB744515 RIKEN full-length enriched, adult male kidney Mus musculus
cDNA clone F530003I24 3', mRNA sequence.
ACCESSION BB744515
VERSION BB744515.1 GI:16152351
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 469)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al.
2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.
Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues. 
FEATURES Location/Qualifiers
source 1..469
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="F530003I24"
/sex="male"
/tissue_type="kidney"
/dev_stage="adult"
/lab_host="SOLR"
/clone_lib="RIKEN full-length enriched, adult male kidney"
/note="Site_1: XhoI; Site_2: SstI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAGCGGCCGCAACTCGAGTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. Second strand cDNA was prepared with the
primer adapter of sequence [5'
GAGAGAGAGAAGGATCCAAGAGCTCAATTAATTAATTAAACCCCCCCCCCC 3'].
cDNA was cleaved with XhoI and SstI."

ORIGIN 

Query Match 29.5%; Score 455; DB 10; Length 469;
Best Local Similarity 100.0%; Pred. No. 8.5e-99;
Matches 455; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1089 TAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTGGGTCCACA 1148
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 TAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTGGGTCCACA 60

Qy   1149 TGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGGAAAAAATA 1208
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGGAAAAAATA 120

Qy   1209 AGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACAAATATTGT 1268
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 AGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACAAATATTGT 180

Qy   1269 CAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAG 1328
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAG 240

Qy   1329 TGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCATTGGTCAGG 1388
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 TGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCATTGGTCAGG 300

Qy   1389 TCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAATTTATGTG 1448
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAATTTATGTG 360

Qy   1449 AAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTATATTTCTTG 1508
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 AAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTATATTTCTTG 420

Qy   1509 AAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543
          |||||||||||||||||||||||||||||||||||
Db    421 AAAAAATAACTGCTGTGCCTAAATAAATCAATATA 455
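
Note on the summary lines: the percentages printed for each result appear to follow directly from the score and the alignment counts. The sketch below is an illustration only, not the search program's own code. It assumes that "Query Match" is the score divided by the perfect score (1543, as listed in the run header) and that "Best Local Similarity" is the number of matches divided by the number of aligned columns (matches + mismatches + indels). The function name `summary_stats` is hypothetical.

```python
def summary_stats(score, perfect_score, matches, mismatches, indels):
    """Recompute the two percentages shown in a result's summary lines.

    Assumed definitions (not confirmed by the report itself):
      Query Match           = score / perfect score
      Best Local Similarity = matches / (matches + mismatches + indels)
    """
    query_match = 100.0 * score / perfect_score
    aligned_columns = matches + mismatches + indels
    best_local_similarity = 100.0 * matches / aligned_columns
    return round(query_match, 1), round(best_local_similarity, 1)


PERFECT_SCORE = 1543  # perfect score listed in the run header

# Values taken from the summary lines of RESULT 5 and RESULT 6.
print(summary_stats(455, PERFECT_SCORE, 455, 0, 0))
print(summary_stats(438, PERFECT_SCORE, 449, 0, 1))
```

Under these assumptions the recomputed values agree with the printed figures for the results in this section (for example 29.5% and 100.0% for RESULT 5).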



RESULT 6 
BB746222 
LOCUS BB746222 458 bp mRNA linear EST 15-OCT-2001
DEFINITION BB746222 RIKEN full-length enriched, adult male kidney Mus musculus
cDNA clone F530013P03 3', mRNA sequence.
ACCESSION BB746222
VERSION BB746222.1 GI:16149159
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 458)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al.
2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.
Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues. 
FEATURES Location/Qualifiers
source 1..458
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="F530013P03"
/sex="male"
/tissue_type="kidney"
/dev_stage="adult"
/lab_host="SOLR"
/clone_lib="RIKEN full-length enriched, adult male kidney"
/note="Site_1: XhoI; Site_2: SstI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAGCGGCCGCAACTCGAGTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. Second strand cDNA was prepared with the
primer adapter of sequence [5'
GAGAGAGAGAAGGATCCAAGAGCTCAATTAATTAATTAAACCCCCCCCCCC 3'].
cDNA was cleaved with XhoI and SstI."

ORIGIN 

Query Match 28.4%; Score 438; DB 10; Length 458;
Best Local Similarity 99.8%; Pred. No. 1.1e-94;
Matches 449; Conservative 0; Mismatches 0; Indels 1; Gaps 1;

Qy   1058 GTTGAGTTTTAACTAAGTAAACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAAC 1117
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     10 GTTGAGTTTTAACTAAGTAAACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAAC 69

Qy   1118 CCCCAGGGCTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGAT 1177
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     70 CCCCAGGGCTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGAT 129

Qy   1178 TTTAGGTTATACCCAGAGTATGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACT 1237
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    130 TTTAGGTTATACCCAGAGTATGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACT 189

Qy   1238 TAAGAACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATC 1297
          |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    190 TAAG-ACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATC 248

Qy   1298 TTGGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGC 1357
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    249 TTGGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGC 308

Qy   1358 AAATGCTGAATGCATTTCATCATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTA 1417
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    309 AAATGCTGAATGCATTTCATCATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTA 368

Qy   1418 TTTTTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACAT 1477
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    369 TTTTTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACAT 428

Qy   1478 TAGATTTTCTATTTGAAAATTATATTTCTT 1507
          ||||||||||||||||||||||||||||||
Db    429 TAGATTTTCTATTTGAAAATTATATTTCTT 458



RESULT 7 
BB738743 

LOCUS BB738743 428 bp mRNA linear EST 15-OCT-2001
DEFINITION BB738743 RIKEN full-length enriched, 6 days neonate spleen Mus
musculus cDNA clone F430109C18 3', mRNA sequence.
ACCESSION BB738743
VERSION BB738743.1 GI:16141748
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 428)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al.
2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.
Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues.
FEATURES Location/Qualifiers
source 1..428
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="taxon:10090"
/clone="F430109C18"
/tissue_type="spleen"
/dev_stage="6 days neonate"
/clone_lib="RIKEN full-length enriched, 6 days neonate
spleen"

ORIGIN 

Query Match 26.8%; Score 414; DB 10; Length 428; 

Best Local Similarity 100.0%; Pred. No. 6.5e-89; 

Matches 414; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1130 AGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATAC 1189
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 AGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATAC 60

Qy   1190 CCAGAGTATGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAAC 1249
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 CCAGAGTATGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAAC 120

Qy   1250 AAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTA 1309
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 AAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTA 180

Qy   1310 AGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATG 1369
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 AGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATG 240

Qy   1370 CATTTCATCATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTG 1429
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 CATTTCATCATTGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTG 300

Qy   1430 TAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTAT 1489
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTAT 360

Qy   1490 TTGAAAATTATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 TTGAAAATTATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 414



RESULT 8 
BB847918 
LOCUS BB847918 422 bp mRNA linear EST 26-NOV-2001
DEFINITION BB847918 RIKEN full-length enriched, adult male kidney Mus musculus
cDNA clone F530201F11 5', mRNA sequence.
ACCESSION BB847918
VERSION BB847918.1 GI:17086293
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 422)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al.
2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.
Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues. 
FEATURES Location/Qualifiers
source 1..422
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="F530201F11"
/sex="male"
/tissue_type="kidney"
/dev_stage="adult"
/lab_host="SOLR"
/clone_lib="RIKEN full-length enriched, adult male kidney"
/note="Site_1: XhoI; Site_2: SstI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAGCGGCCGCAACTCGAGTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. Second strand cDNA was prepared with the
primer adapter of sequence [5'
GAGAGAGAGAAGGATCCAAGAGCTCAATTAATTAATTAAACCCCCCCCCCC 3'].
cDNA was cleaved with XhoI and SstI."

ORIGIN 

Query Match 26.2%; Score 403.8; DB 10; Length 422; 

Best Local Similarity 99.5%; Pred. No. 1.9e-86; 

Matches 405; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy      1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     16 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 75

Qy     61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     76 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 135

Qy    121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
          |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db    136 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTTGGCTACCTCTT 195

Qy    181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    196 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 255

Qy    241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    256 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 315

Qy    301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    316 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 375

Qy    361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGT 407
          ||||||||||||||||||||||||||||||||||||| |||||||||
Db    376 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGGTCATGAAGT 422



RESULT 9 
BB864882 
LOCUS BB864882 420 bp mRNA linear EST 09-JUL-2003
DEFINITION BB864882 RIKEN full-length enriched, RCB-1283 B16 melanoma cDNA Mus
musculus cDNA clone G430047C11 5', mRNA sequence.
ACCESSION BB864882
VERSION BB864882.1 GI:17111092
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 420)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al.
2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.
Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues.
FEATURES Location/Qualifiers
source 1..420
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6"
/db_xref="taxon:10090"
/clone="G430047C11"
/tissue_type="skin"
/cell_line="RCB-1283 B16 melanoma"
/clone_lib="RIKEN full-length enriched, RCB-1283 B16
melanoma cDNA"



ORIGIN 



Query Match 25.2%; Score 388.4; DB 10; Length 420;
Best Local Similarity 99.5%; Pred. No. 9.7e-83;
Matches 400; Conservative 0; Mismatches 1; Indels 1; Gaps 1;

Qy      1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     19 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 78

Qy     61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     79 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 138

Qy    121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
          |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db    139 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTTGGCTACCTCTT 198

Qy    181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    199 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 258

Qy    241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    259 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 318

Qy    301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    319 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 378

Qy    361 CCTCTTCCTCACTTTCATTAG-CATGGACCGATATCTGCTCA 401
          ||||||||||||||||||||| ||||||||||||||||||||
Db    379 CCTCTTCCTCACTTTCATTAGCCATGGACCGATATCTGCTCA 420



RESULT 10 

BB778587 

LOCUS BB778587 426 bp mRNA linear EST 08-JUL-2003
DEFINITION BB778587 RIKEN full-length enriched, RCB-1283 B16 melanoma cDNA Mus
musculus cDNA clone G430047C11 3', mRNA sequence.
ACCESSION BB778587
VERSION BB778587.1 GI:16939287
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 426)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al.
2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.
RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.
Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues.
FEATURES Location/Qualifiers
source 1..426
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6"
/db_xref="taxon:10090"
/clone="G430047C11"
/tissue_type="skin"
/cell_line="RCB-1283 B16 melanoma"
/clone_lib="RIKEN full-length enriched, RCB-1283 B16
melanoma cDNA"



ORIGIN 



Query Match 24.9%; Score 384.8; DB 10; Length 426;
Best Local Similarity 98.8%; Pred. No. 7.2e-82;
Matches 419; Conservative 0; Mismatches 2; Indels 3; Gaps 3;

Qy   1123 GGGCTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAG 1182
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      3 GGGCTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAG 62

Qy   1183 GTTATACCCAGAGTATGGAAAAAATAA-GGCATGAGAAAGCATTGACATCTTCACTTAAG 1241
          ||||||||||||||||||||||||||| ||||||| ||||||||||||||||||||||||
Db     63 GTTATACCCAGAGTATGGAAAAAATAAGGGCATGAAAAAGCATTGACATCTTCACTTAAG 122

Qy   1242 AACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGG 1301
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    123 AACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTTG 182

Qy   1302 AAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAAT 1361
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    183 AAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAAT 242

Qy   1362 GCTGAATGCATTTCATCATTGGTCA-GGTCGATAAGCGTGTTTCTGAAATAGTCTTATTT 1420
          ||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db    243 GCTGAATGCATTTCATCATTGGTCACGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTT 302

Qy   1421 TTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTAG 1480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    303 TTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATAATTCAATGTACAACATTAG 362

Qy   1481 ATTTTCTA-TTTGAAAATTATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAA 1539
          |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Db    363 ATTTTCTAGTTTGAAAATTATATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAA 422

Qy   1540 TATA 1543
          ||||
Db    423 TATA 426



RESULT 11 

BB739482 

LOCUS BB739482 396 bp mRNA linear EST 15-OCT-2001
DEFINITION BB739482 RIKEN full-length enriched, 6 days neonate spleen Mus
musculus cDNA clone F430113M16 3', mRNA sequence.
ACCESSION BB739482
VERSION BB739482.1 GI:16142487
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 396)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al.
2001)
JOURNAL Unpublished (2001)
COMMENT Contact: Yoshihide Hayashizaki
Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.

Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.

RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.

Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)

Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.

e mouse tissues.

FEATURES Location/Qualifiers
source 1..396
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="taxon:10090"
/clone="F430113M16"
/tissue_type="spleen"
/dev_stage="6 days neonate"
/clone_lib="RIKEN full-length enriched, 6 days neonate
spleen"



ORIGIN 



Query Match 24.7%; Score 380.4; DB 10; Length 396;
Best Local Similarity 99.7%; Pred. No. 8.2e-81;
Matches 381; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy       1162 AGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGGAAAAAATAAGGCATGAGAAAG 1221
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 AGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGGAAAAAATAAGGCATGAGAAAG 60

Qy       1222 CATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACA 1281
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 CATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACAAATATTGTCAATGTTTGGACA 120

Qy       1282 CTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATA 1341
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 CTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATA 180

Qy       1342 CAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCATTGGTCAGGTCGATAAGCGTGT 1401
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCATTGGTCAGGTCGATAAGCGTGT 240

Qy       1402 TTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATA 1461
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 TTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAATTTATGTGAAAAATGAATATA 300

Qy       1462 ATTCAATGTACAACATTAGATTTTCTATTTGAAAATTATATTTCTTGAAAAAATAACTGC 1521
              ||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 ATTCAATTTACAACATTAGATTTTCTATTTGAAAATTATATTTCTTGAAAAAATAACTGC 360

Qy       1522 TGTGCCTAAATAAATCAATATA 1543
              ||||||||||||||||||||||
Db        361 TGTGCCTAAATAAATCAATATA 382



RESULT 12 
AI649254/c 

LOCUS AI649254 367 bp mRNA linear EST 30-APR-1999
DEFINITION uk27c10.x1 Sugano mouse kidney mkia Mus musculus cDNA clone
IMAGE:1970226 3', mRNA sequence.
ACCESSION AI649254
VERSION AI649254.1 GI:4730088
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 367)
AUTHORS Marra,M., Hillier,L., Kucaba,T., Martin,J., Beck,C., Wylie,T.,
Underwood,K., Steptoe,M., Theising,B., Allen,M., Bowers,Y.,
Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., Schurk,R.,
Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., McCann,R.,
Waterston,R. and Wilson,R.
TITLE The WashU-NCI Mouse EST Project 1999
JOURNAL Unpublished (1999)
COMMENT Other_ESTs: uk27c10.y1

Contact: Marra M/WashU-NCI Mouse EST Project 1999 

Washington University School of Medicine 

4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA 
Tel: 314 286 1800 
Fax: 314 286 1810 

Email: mouseest@watson.wustl.edu 

This clone is available royalty-free through LLNL ; contact the 
IMAGE Consortium (info@image.llnl.gov) for further information. 
MGI: 986966 

This clone was previously sequenced on the 5' end only, this new
data is from the 3' end
Seq primer: custom primer used 
High quality sequence stop: 353. 
FEATURES Location/Qualifiers 
source 1..367
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL"
/db_xref="taxon:10090"
/clone="IMAGE:1970226"
/sex="female"
/dev_stage="adult"
/lab_host="DH10B"
/clone_lib="Sugano mouse kidney mkia"
/note="Organ: kidney; Vector: pME18S-FL3; Site_1: DraIII
(CACTGTGTG); Site_2: DraIII (CACCATGTG); 1st strand cDNA
was primed with an oligo(dT) primer
[ATGTGGCCTTTTTTTTTTTTTTTTT]; double-stranded cDNA was
ligated to a DraIII adaptor [TGTTGGCCTACTGG], digested
and cloned into distinct DraIII sites of the pME18S-FL3
vector (5' site CACTGTGTG, 3' site CACCATGTG). XhoI should
be used to isolate the cDNA insert. Size selection was
performed to exclude fragments <1.5kb. Library
constructed by Dr. Sumio Sugano (University of Tokyo
Institute of Medical Science). Custom primers for
sequencing: 5' end primer CTTCTGCTCTAAAAGCTGCG and 3' end
primer CGACCTGCAGCTCGAGCACA."

ORIGIN 

Query Match 23.6%; Score 363.8; DB 9; Length 367; 

Best Local Similarity 99.5%; Pred. No. 8.2e-77; 

Matches 365; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
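
The two percentages in a result header appear to follow from the other header fields. The sketch below is a hedged illustration only: the formulas are inferred from the numbers printed in this report, not taken from GenCore documentation, and the function names are hypothetical. It assumes Query Match is the score as a percentage of the perfect score (1543 for this query) and Best Local Similarity is the match count as a percentage of all aligned columns.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Matches as a percentage of aligned columns.

    Assumes every indel occupies exactly one alignment column, which
    holds for the single-base gaps shown in this report.
    """
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Header values for RESULT 12 (AI649254), perfect score 1543.
print(round(query_match_pct(363.8, 1543), 1))
print(round(best_local_similarity_pct(365, 0, 2, 0), 1))
```

Rounded to one decimal place, these reproduce the 23.6% and 99.5% shown in the header above.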



Qy       1035 CACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACCACCATTTCTAGGCT 1094
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        367 CACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACCACCATTTCTACGCT 308

Qy       1095 TTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTGGGTCCACATGAATC 1154
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        307 TTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTGGGTCCACATGAATC 248

Qy       1155 AGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGGAAAAAATAAGGCAT 1214
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        247 AGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGGAAAAAATAAGGCAT 188

Qy       1215 GAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACAAATATTGTCAATGT 1274
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Db        187 GAGAAAGCATTGACATCTTCACTTAAGATCTGAACAAAAGAGAACAAATATTGTCAATGT 128

Qy       1275 TTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAA 1334
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        127 TTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAGTGTAAA 68

Qy       1335 AGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCATTGGTCAGGTCGATA 1394
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         67 AGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCATTGGTCAGGTCGATA 8

Qy       1395 AGCGTGT 1401
              |||||||
Db          7 AGCGTGT 1



RESULT 13 

BB645274 

LOCUS BB645274 636 bp mRNA linear EST 31-AUG-2001
DEFINITION BB645274 RIKEN full-length enriched, 4 days neonate male adipose
Mus musculus cDNA clone B430012O21 5', mRNA sequence.
ACCESSION BB645274
VERSION BB645274.1 GI:15402306
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 636)
AUTHORS Arakawa,T., Carninci,P., Fukuda,S., Furuno,M., Hanagaki,T.,
Hara,A., Hiramoto,K., Hori,F., Ishii,Y., Ito,M., Kawai,J.,
Konno,H., Kouda,M., Koya,S., Matsuyama,T., Miyazaki,A., Nomura,K.,
Ohno,M., Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K.,
Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F.,
Takeda,Y., Tanaka,T., Toya,T., Muramatsu,M. and Hayashizaki,Y.



TITLE RIKEN Mouse ESTs (Arakawa,T., et al . 2001) 

JOURNAL Unpublished (2001) 
COMMENT Contact: Yoshihide Hayashizaki 

Laboratory for Genome Exploration Research Group, RIKEN Genomic 

Sciences Center (GSC), Yokohama Institute

The Institute of Physical and Chemical Research (RIKEN)

1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan

Tel: 81-45-503-9222

Fax: 81-45-503-9216

Email: genome-res@gsc.riken.go.jp,

URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.

Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.

RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.

Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)

Yamanaka,I., Kiyosawa,H., Kondo,S., Saito,T., Shinagawa,A.,
Aizawa,K., Fukuda,S., Hara,A., Itoh,M., Kawai,J., Shibata,K.,
Arakawa,T., Ishii,Y. and Hayashizaki,Y.

Mapping of 19032 mouse cDNAs on mouse chromosomes. J. Struct.
Func. Genomics 2 pre, L72-L86 (2001)

Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues.
FEATURES Location/Qualifiers 
source 1..636
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="B430012O21"
/sex="male"
/tissue_type="adipose"
/dev_stage="4 days neonate"
/lab_host="DH10B"
/clone_lib="RIKEN full-length enriched, 4 days neonate
male adipose"
/note="Site_1: SalI; Site_2: BamHI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAAGGATCCAAGAGCTCTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. cDNA went through one round of normalization
to Rot = 10.0 and subtraction to Rot = 229.0. Second
strand cDNA was prepared with the primer adapter of
sequence [5' GAGAGAGAGATTCTCGAGTTAATTAAATTAATCCCCCCCCCCCCC
3']. cDNA was cleaved with XhoI and BamHI. Vector: a
modified pBluescript KS(+) after bulk excision from Lambda
FLC I."

ORIGIN 

Query Match 23.2%; Score 357.6; DB 10; Length 636; 

Best Local Similarity 91.7%; Pred. No. 2.7e-75; 

Matches 389; Conservative 0; Mismatches 34; Indels 1; Gaps 1; 
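
The reported scores can be cross-checked against the match counts. The sketch below is an assumption-laden illustration, not the GenCore scoring code: the report states only the scoring table name (IDENTITY_NUC) and the gap parameters (Gapop 10.0, Gapext 1.0). The per-base values used here (+1.0 per match, -0.6 per mismatch) and the cost of 11.0 for a single-base gap are inferred from the headers in this report, and the function name is hypothetical.

```python
def expected_score(matches: int, mismatches: int, single_base_gaps: int,
                   match: float = 1.0, mismatch: float = -0.6,
                   gap_open: float = 10.0, gap_extend: float = 1.0) -> float:
    """Recompute an alignment score from its header counts.

    Assumes every gap is one base long (Indels equals Gaps in the
    headers here), so each gap costs gap_open + gap_extend.
    The match and mismatch values are inferred, not documented.
    """
    gap_cost = single_base_gaps * (gap_open + gap_extend)
    return matches * match + mismatches * mismatch - gap_cost


# RESULT 13: Matches 389, Mismatches 34, Gaps 1; reported Score 357.6
print(round(expected_score(389, 34, 1), 1))
# RESULT 12: Matches 365, Mismatches 2, Gaps 0; reported Score 363.8
print(round(expected_score(365, 2, 0), 1))
```

A hit whose database sequence contains an ambiguous base (N) may not fit this arithmetic exactly, since the treatment of N under the scoring table is not stated in the report.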

Qy          1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         20 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 79

Qy         61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||
Db         80 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGGATTTTA 139

Qy        121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
              ||||||||||||||||||||||||||||||||||||||||| |||||| || ||||| ||
Db        140 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGGGGTGTTTGGGTACCTGTT 199

Qy        181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        200 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 259

Qy        241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
              |||||||||| ||||||||||||||||||||||||||| |||||||||||||||||||||
Db        260 TGCTTTCCTGGGCACCCTTCCCATCCTGATAAAGAGTTTTGCCAATGATAAGGGGACCTA 319

Qy        301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACC-AGCA 359
              |||||||||||| || |||||||||||||||| |||| |||||||||| ||  || |||
Db        320 TGGAGATGTTCTTTGGATAAGCAACCGATATGGGCTTAACACCAACCTTTAATCCAAGCT 379

Qy        360 TCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAG 419
              | || || || |||||  ||||||||||||||||| || | |||||    || ||    |
Db        380 TACTTTTACTTACTTTATTTAGCATGGACCGATATTTGTTTATGAAAGTGCCCTTTTTCG 439

Qy        420 AACA 423
              || |
Db        440 AAAA 443



RESULT 14 

BB846608 

LOCUS BB846608 416 bp mRNA linear EST 26-NOV-2001
DEFINITION BB846608 RIKEN full-length enriched, adult male kidney Mus musculus
cDNA clone F530003I24 5', mRNA sequence.
ACCESSION BB846608
VERSION BB846608.1 GI:17084983
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 416)
AUTHORS Akimura,T., Arakawa,T., Carninci,P., Furuno,M., Hanagaki,T.,
Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., Imotani,K.,
Ishii,Y., Ito,M., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
Matsuyama,T., Nakamura,M., Nishi,K., Nomura,K., Numasaki,R.,
Okazaki,Y., Okido,T., Saito,R., Sakai,C., Sakai,K., Sakazume,N.,
Sasaki,D., Sato,K., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Suzuki,H., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Tanaka,T., Tomaru,A., Toya,T., Watahiki,A., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.

TITLE RIKEN Encyclopedia of Mouse Full-length cDNAs (Akimura,T., et al. 

2001) 

JOURNAL Unpublished (2001) 
COMMENT Contact: Yoshihide Hayashizaki 

Laboratory for Genome Exploration Research Group, RIKEN Genomic 

Sciences Center (GSC), Yokohama Institute 

The Institute of Physical and Chemical Research (RIKEN) 

1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan 

Tel: 81-45-503-9222 

Fax: 81-45-503-9216 

Email: genome-res@gsc.riken.go.jp,

URL: http://genome.gsc.riken.go.jp/

Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.

Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

wagi,K., Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E.,
Watahiki,M., Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T.,
Matsuura,S., Kawai,J., Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A.
and Hayashizaki,Y.

RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Konno,H., Fukunishi,Y., Shibata,K., Itoh,M., Carninci,P.,
Sugahara,Y. and Hayashizaki,Y.

Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)
Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.
e mouse tissues.
FEATURES Location/Qualifiers 
source 1..416
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="F530003I24"
/sex="male"
/tissue_type="kidney"
/dev_stage="adult"
/lab_host="SOLR"
/clone_lib="RIKEN full-length enriched, adult male kidney"
/note="Site_1: XhoI; Site_2: SstI; cDNA library was
prepared and sequenced in Mouse Genome Encyclopedia
Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in
RIKEN. Division of Experimental Animal Research in Riken
contributed to prepare mouse tissues. 1st strand cDNA was
primed with a primer [5'
GAGAGAGAGAGCGGCCGCAACTCGAGTTTTTTTTTTTTTTTTVN 3'], cDNA was
prepared by using trehalose thermo-activated reverse
transcriptase and subsequently enriched for full-length by
cap-trapper. Second strand cDNA was prepared with the
primer adapter of sequence [5'
GAGAGAGAGAAGGATCCAAGAGCTCAATTAATTAATTAAACCCCCCCCCCC 3'].
cDNA was cleaved with XhoI and SstI."



ORIGIN 



Query Match 23.0%; Score 354.2; DB 10; Length 416;
Best Local Similarity 97.3%; Pred. No. 1.7e-74;
Matches 392; Conservative 0; Mismatches 8; Indels 3; Gaps 3;



Qy          1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
              |||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
Db         16 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGAAAGCAGAATGGCACAGAATTTATC 75

Qy         61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         76 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 135

Qy        121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
              |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db        136 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTTGGCTACCTCTT 195

Qy        181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
              |||||||||||||||||| ||||||||||||||||||||||| |||| ||||||||||||
Db        196 CTGCATGAAGAACTGGAAAAGCAGCAATGTCTATCTTTTTAAACTTT-CATCTCTGACTT 254

Qy        241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
              ||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
Db        255 TGCTTTCCTGTGCACCCTT-CCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 313

Qy        301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
              |||||||||||| ||||||||||||||||||||| |||||| ||||||||| ||||||||
Db        314 TGGAGATGTTCTATGTATAAGCAACCGATATGTGGTTCACAACAACCTCTAAACCAGCAT 373

Qy        361 CCTCTTCCTCACTTTCATTAG-CATGGACCGATATCTGCTCAT 402
              ||||||||||||||||||||| |||||||||||||||||||||
Db        374 CCTCTTCCTCACTTTCATTAGCCATGGACCGATATCTGCTCAT 416



RESULT 15 

BY368584 

LOCUS BY368584 408 bp mRNA linear EST 12-DEC-2002
DEFINITION BY368584 RIKEN full-length enriched, 6 days neonate spleen Mus
musculus cDNA clone F430110C01 3', mRNA sequence.
ACCESSION BY368584
VERSION BY368584.1 GI:26598072
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 408)
AUTHORS Okazaki,Y., Furuno,M., Kasukawa,T., Adachi,J., Bono,H., Kondo,S.,
Nikaido,I., Osato,N., Saito,R., Suzuki,H., Yamanaka,I.,
Kiyosawa,H., Yagi,K., Tomaru,Y., Hasegawa,Y., Nogami,A.,
Schonbach,C., Gojobori,T., Baldarelli,R., Hill,D.P., Bult,C.,
Hume,D.A., Quackenbush,J., Schriml,L.M., Kanapin,A., Matsuda,H.,
Batalov,S., Beisel,K.W., Blake,J.A., Bradt,D., Brusic,V.,
Chothia,C., Corbani,L.E., Cousins,S., Dalla,E., Dragani,T.A.,
Fletcher,C.F., Forrest,A., Frazer,K.S., Gaasterland,T.,
Gariboldi,M., Gissi,C., Godzik,A., Gough,J., Grimmond,S.,
Gustincich,S., Hirokawa,N., Jackson,I.J., Jarvis,E.D., Kanai,A.,
Kawaji,H., Kawasawa,Y., Kedzierski,R.M., King,B.L., Konagaya,A.,
Kurochkin,I.V., Lee,Y., Lenhard,B., Lyons,P.A., Maglott,D.R.,
Maltais,L., Marchionni,L., McKenzie,L., Miki,H., Nagashima,T.,
Numata,K., Okido,T., Pavan,W.J., Pertea,G., Pesole,G.,
Petrovsky,N., Pillai,R., Pontius,J.U., Qi,D., Ramachandran,S.,
Ravasi,T., Reed,J.C., Reed,D.J., Reid,J., Ring,B.Z., Ringwald,M.,
Sandelin,A., Schneider,C., Semple,C.A., Setou,M., Shimada,K.,
Sultana,R., Takenaka,Y., Taylor,M.S., Teasdale,R.D., Tomita,M.,
Verardo,R., Wagner,L., Wahlestedt,C., Wang,Y., Watanabe,Y.,
Wells,C., Wilming,L.G., Wynshaw-Boris,A., Yanagisawa,M., Yang,I.,
Yang,L., Yuan,Z., Zavolan,M., Zhu,Y., Zimmer,A., Carninci,P.,
Hayatsu,N., Hirozane-Kishikawa,T., Konno,H., Nakamura,M.,
Sakazume,N., Sato,K., Shiraki,T., Waki,K., Kawai,J., Aizawa,K.,
Arakawa,T., Fukuda,S., Hara,A., Hashizume,W., Imotani,K., Ishii,Y.,
Itoh,M., Kagawa,I., Miyazaki,A., Sakai,K., Sasaki,D., Shibata,K.,
Shinagawa,A., Yasunishi,A., Yoshino,M., Waterston,R., Lander,E.S.,
Rogers,J., Birney,E. and Hayashizaki,Y.
TITLE Analysis of the mouse transcriptome based on functional annotation 

of 60,770 full-length cDNAs 
JOURNAL Nature 420, 563-573 (2002) 
MEDLINE 22354683 
PUBMED 12466851 
COMMENT Contact: Yoshihide Hayashizaki

Laboratory for Genome Exploration Research Group, RIKEN Genomic
Sciences Center (GSC), Yokohama Institute
The Institute of Physical and Chemical Research (RIKEN)
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
Tel: 81-45-503-9222
Fax: 81-45-503-9216
Email: genome-res@gsc.riken.go.jp,
URL: http://genome.gsc.riken.go.jp/
Aizawa,K., Akimura,T., Arakawa,T., Carninci,P., Fukuda,S.,
Hirozane,T., Imotani,K., Ishii,Y., Itoh,M., Kawai,J., Konno,H.,
Miyazaki,A., Murata,M., Nakamura,M., Nomura,K., Numazaki,R.,
Ohno,M., Sakai,K., Sakazume,N., Sasaki,D., Sato,K., Shibata,K.,
Shiraki,T., Tagami,M., Waki,K., Watahiki,A., Muramatsu,M. and
Hayashizaki,Y. Direct Submission

Computational Analysis of Full-Length Mouse cDNAs Compared with
Human Genome Sequences Mamm. Genome. 12, 673-677 (2001)

Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new
genes. Genome Res. 10 (10), 1617-1630 (2000)

RIKEN integrated sequence analysis (RISA) system--384-format
sequencing pipeline with 384 multicapillary sequencer. Genome Res.
10 (11), 1757-1771 (2000)

Computer-based methods for the mouse full-length cDNA
encyclopedia: real-time sequence clustering for construction of a
nonredundant cDNA library. Genome Res. 11 (2), 281-289 (2001)

cDNA library was prepared and sequenced in Mouse Genome
Encyclopedia Project of Genome Exploration Research Group in Riken
Genomic Sciences Center and Genome Science Laboratory in RIKEN.
Division of Experimental Animal Research in Riken contributed to
prepare mouse tissues.

Please visit our web site (http://genome.gsc.riken.go.jp) for
further details.

FEATURES Location/Qualifiers
source 1..408
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="taxon:10090"
/clone="F430110C01"
/tissue_type="spleen"
/dev_stage="6 days neonate"
/clone_lib="RIKEN full-length enriched, 6 days neonate
spleen"



ORIGIN 



Query Match 22.7%; Score 350.6; DB 13; Length 408; 

Best Local Similarity 98.0%; Pred. No. 1.3e-73; 

Matches 386; Conservative 0; Mismatches 5; Indels 3; Gaps 3;



Qy       1153 TCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGGAAAAAATAAGGC 1212
              ||||||| ||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db          1 TCAGAAGCCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGNAAAAAATAAGGC 60

Qy       1213 ATGAGAAAGCA-TTGACATCTTCACTTAAGAACTGAACAAAAGAGAACAAATA-TTGTCA 1270
              ||||||||||| ||||||||||||||||||||||||||||||||||||||||| || |||
Db         61 ATGAGAAAGCAGTTGACATCTTCACTTAAGAACTGAACAAAAGAGAACAAATAGTTCTCA 120

Qy       1271 ATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAGTG 1330
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 ATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTTTCTATCAGTG 180

Qy       1331 TAAAAGGAATACAAGATAG-CTAGTTGCAAATGCTGAATGCATTTCATCATTGGTCAGGT 1389
              ||||||||||||||||||| |||||| |||||||||||||||||||||||||||||||||
Db        181 TAAAAGGAATACAAGATAGCCTAGTTCCAAATGCTGAATGCATTTCATCATTGGTCAGGT 240

Qy       1390 CGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAATTTATGTGA 1449
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAATTTATGTGA 300

Qy       1450 AAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTATATTTCTTGA 1509
              ||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
Db        301 AAAATGAATATAATTCAATTTACAACATTAGATTTTCTATTTGAAAATTATATTTCTTGA 360

Qy       1510 AAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543
              ||||||||||||||||||||||||||||||||||



Search completed: August 24, 2004, 16:03:07 
Job time : 4316 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 
Run on: August 24, 2004, 08:45:05 ; Search time 6253 Seconds
(without alignments)
10695.394 Million cell updates/sec

Title: US-09-891-138A-1
Perfect score: 1543
Sequence: 1 gctcctggcagagttttctg tgcctaaataaatcaatata 1543

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0



Searched: 3470272 seqs, 21671516995 residues 

Total number of hits satisfying chosen parameters: 6940544

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%

Maximum Match 100%
Listing first 45 summaries



Database : GenEmbl:*
1: gb_ba:*
2: gb_htg:*
3: gb_in:*
4: gb_om:*
5: gb_ov:*
6: gb_pat:*
7: gb_ph:*
8: gb_pl:*
9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_mu:*
20: em_om:*
21: em_or:*
22: em_ov:*
23: em_pat:*
24: em_ph:*
25: em_pl:*
26: em_ro:*
27: em_sts:*
28: em_un:*
29: em_vi:*
30: em_htg_hum:*
31: em_htg_inv:*
32: em_htg_other:*
33: em_htg_mus:*
34: em_htg_pln:*
35: em_htg_rod:*
36: em_htg_mam:*
37: em_htg_vrt:*
38: em_sy:*
39: em_htgo_hum:*
40: em_htgo_mus:*
41: em_htgo_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 
Result          Query
   No.  Score   Match  Length  DB  ID          Description
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38. 


4 


1449 


9 


BC030948 


BC030948 


Homo sapi 


13 


592. 


4 


38. 
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38. 


3 
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38. 


3 


132745 


9 
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38. 
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8 
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8 
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3 
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6 
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20 
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6 


9. 


6 
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5 
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21 
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2 


9. 


4 
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2 


AC101335 
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22 


133. 


8 


8. 


7 
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6 
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23 


126. 
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8. 
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24 


126. 


6 


8. 
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6 


BD184762 


BD184762 
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25 


126. 


6 


8. 


2 
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26 


126. 


6 


8. 
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1014 
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27 


126. 


6 


8. 
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1014 


6 
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28 


126. 
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126. 
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AX593341 


Sequence 
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126. 
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126. 


6 


8. 
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34 
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35 
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.2 
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37 


126. 
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.2 
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38 


126. 
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8 , 


.2 


1414 


9 
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39 


126. 
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8 , 


. 2 


1729 


6 
    40  126.6     8.2    1797   6  AX593340    Sequence
    41  126.6     8.2    9905   6  AX379470    AX379470 Sequence
c   42  126.6     8.2   67645   9  AL356486    AL356486 Human DNA
    43  126.6     8.2  156555   9  AC026756    Homo sapi


44 


125. 


.6 


8, 


.1 
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6 
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AX657422 
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45 
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8, 


.1 
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6 
ALIGNMENTS 



RESULT 1 
AX376573 
LOCUS       AX376573   1543 bp    DNA     linear   PAT 01-MAR-2002
DEFINITION  Sequence 1 from Patent WO0200719.
ACCESSION   AX376573
VERSION     AX376573.1  GI:19170674
KEYWORDS
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Lin,D.C., Zhao,J., Chen,J.L. and Cutler,G.
  TITLE     Novel receptors
  JOURNAL   Patent: WO 0200719-A 1 03-JAN-2002;
            Tularik Inc. (US)
FEATURES             Location/Qualifiers
     source          1..1543
                     /organism="Mus musculus"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:10090"
     CDS             44..997
                     /note="unnamed protein product; mouse TGR18 G-protein
                     coupled receptor (GPCR)"
                     /codon_start=1
                     /protein_id="CAD26816.1"
                     /db_xref="GI:19170675"
                     /db_xref="REMTREMBL:CAD26816"
                     /translation="MAQNLSCENWLATEAILNKYYLSAFYAIEFIFGLLGNVTVVFGY
                     LFCMKNWNSSNVYLFNLSISDFAFLCTLPILIKSYANDKGTYGDVLCISNRYVLHTNL
                     YTSILFLTFISMDRYLLMKYPFREHFLQKKEFAILISLAVWALVTLEVLPMLTFINSV
                     PKEEGSNCIDYASSGNPEHNLIYSLCLTLLGFLIPLSVMCFFYYKMVVFLKRRSQQQA
                     TALPLDKPQRLVVLAVVIFSILFTPYHIMRNLRIASRLDSWPQGCTQKAIKSIYTLTR
                     PLAFLNSAINPIFYFLMGDHYREMLISKFRQYFKSLTSFRT"
ORIGIN

  Query Match             100.0%;  Score 1543;  DB 6;  Length 1543;
  Best Local Similarity   100.0%;  Pred. No. 0;
  Matches 1543;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;



Qy       1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60

Qy      61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120

Qy     121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180

Qy     181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240

Qy     241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300

Qy     301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360

Qy     361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420

Qy     421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480

Qy     481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 540
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 540

Qy     541 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 600
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     541 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 600

Qy     601 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 660
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     601 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 660

Qy     661 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 720
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     661 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 720

Qy     721 CCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 780
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     721 CCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 780

Qy     781 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 840
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     781 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 840

Qy     841 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 900
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     841 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 900

Qy     901 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 960
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     901 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 960

Qy     961 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1020
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     961 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1020

Qy    1021 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1080
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1021 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1080

Qy    1081 ACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 1140
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1081 ACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 1140

Qy    1141 GGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGG 1200
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1141 GGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGG 1200

Qy    1201 AAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACA 1260
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1201 AAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACA 1260

Qy    1261 AATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTT 1320
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1261 AATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTT 1320

Qy    1321 TCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCAT 1380
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1321 TCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCAT 1380

Qy    1381 TGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAA 1440
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1381 TGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAA 1440

Qy    1441 TTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTAT 1500
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1441 TTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTAT 1500

Qy    1501 ATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543
           |||||||||||||||||||||||||||||||||||||||||||
Db    1501 ATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543



RESULT 2 
AF295367 
LOCUS       AF295367   1598 bp    mRNA    linear   ROD 06-APR-2001
DEFINITION  Mus musculus G-protein coupled receptor GPR91 mRNA, complete cds.
ACCESSION   AF295367
VERSION     AF295367.1  GI:12711490
KEYWORDS
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1598)
  AUTHORS   Wittenberger,T., Schaller,H.C. and Hellebrand,S.
  TITLE     An expressed sequence tag (EST) data mining strategy succeeding in
            the discovery of new G-protein coupled receptors
  JOURNAL   J. Mol. Biol. 307 (3), 799-813 (2001)
  MEDLINE   21172992
   PUBMED   11273702
REFERENCE   2  (bases 1 to 1598)
  AUTHORS   Wittenberger,T., Schaller,C.H. and Hellebrand,S.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-AUG-2000) ZMNH, Institut fur
            Entwicklungsneurobiologie, Martinistr. 52, Hamburg 20246, Germany
FEATURES             Location/Qualifiers
     source          1..1598
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL"
                     /db_xref="taxon:10090"
     CDS             74..1027
                     /note="orphan receptor"
                     /codon_start=1
                     /product="G-protein coupled receptor GPR91"
                     /protein_id="AAK01867.1"
                     /db_xref="GI:12711491"
                     /translation="MAQNLSCENWLATEAILNKYYLSAFYAIEFIFGLLGNVTVVFGY
                     LFCMKNWNSSNVYLFNLSISDFAFLCTLPILIKSYANDKGTYGDVLCISNRYVLHTNL
                     YTSMLLLTVISMDRYLLMKYPFREHFLQKKEFAILISLAVWALVTLEVLPMLTFINSV
                     PKEEGSNCIDYASSGNPEHNLIYSLCLTLLGFLIPLSVMCFFYYKMVVFLKRRSQQQA
                     TALPLDKPQRLVVLAVVIFSILFTPYHIMRNLRIASRLDSWPQGCTQKAIKSIYTLTR
                     PLAFLNSAINPIFYFLMGDHYREMLISKFRQYFKSLTSFRT"
ORIGIN

  Query Match             99.4%;  Score 1533.4;  DB 10;  Length 1598;
  Best Local Similarity   99.6%;  Pred. No. 0;
  Matches 1537;  Conservative    0;  Mismatches    6;  Indels    0;  Gaps    0;
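
The percentage fields printed for each hit appear to follow directly from the raw counts. The sketch below is only an illustration under two assumptions inferred from the numbers in this report, not taken from GenCore documentation: Query Match is the hit score divided by the perfect score (1543 for this query), and Best Local Similarity is matches divided by the number of aligned columns. The function and key names are hypothetical.

```python
def alignment_stats(matches, mismatches, indels, score, perfect_score):
    """Recompute the two percentage fields from the raw counts.

    Both formulas are assumptions inferred from the values printed in
    this report; they are not taken from the search program itself.
    """
    aligned_columns = matches + mismatches + indels
    return {
        "query_match": round(100.0 * score / perfect_score, 1),
        "best_local_similarity": round(100.0 * matches / aligned_columns, 1),
    }


# Counts as printed for RESULT 2 (AF295367); the perfect score is 1543.
result_2 = alignment_stats(matches=1537, mismatches=6, indels=0,
                           score=1533.4, perfect_score=1543)
print(result_2)
```

Under these assumptions the RESULT 2 counts give 99.4 and 99.6, the same values as the Query Match and Best Local Similarity fields printed for that hit.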



Qy       1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      31 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 90

Qy      61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      91 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 150

Qy     121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
           |||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
Db     151 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTTGGCTACCTCTT 210

Qy     181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     211 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 270

Qy     241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     271 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 330

Qy     301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     331 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 390

Qy     361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420
            ||||| |||||| ||||||||||||||||||||||||||||||||||||||||||||||
Db     391 GCTCTTGCTCACTGTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 450

Qy     421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     451 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 510

Qy     481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 540
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     511 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 570

Qy     541 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 600
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     571 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 630

Qy     601 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 660
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     631 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 690

Qy     661 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 720
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     691 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 750

Qy     721 CCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 780
           ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db     751 CCAACGCCTGGTGGTCCTGGCAGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 810

Qy     781 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 840
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     811 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 870

Qy     841 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 900
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     871 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 930

Qy     901 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 960
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     931 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 990

Qy     961 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1020
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     991 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1050

Qy    1021 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1080
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1051 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1110

Qy    1081 ACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 1140
           |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db    1111 ACCATTTCTACGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 1170

Qy    1141 GGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGG 1200
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1171 GGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGG 1230

Qy    1201 AAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACA 1260
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1231 AAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACA 1290

Qy    1261 AATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTT 1320
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1291 AATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTT 1350

Qy    1321 TCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCAT 1380
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1351 TCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCAT 1410

Qy    1381 TGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAA 1440
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1411 TGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAA 1470

Qy    1441 TTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTAT 1500
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1471 TTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTAT 1530

Qy    1501 ATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543
           |||||||||||||||||||||||||||||||||||||||||||
Db    1531 ATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1573



RESULT 3 

AC138318/C 

LOCUS       AC138318   202487 bp    DNA     linear   HTG 18-DEC-2003
DEFINITION  Mus musculus chromosome 3 clone RP23-358I23 map 3, *** SEQUENCING
            IN PROGRESS 7 unordered pieces.
ACCESSION   AC138318
VERSION     AC138318.4  GI:40018777
KEYWORDS    HTG; HTGS_PHASE1; HTGS_FULLTOP; HTGS_ACTIVEFIN.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 202487)
  AUTHORS   Birren,B., Nusbaum,C. and Lander,E.
  TITLE     Mus musculus chromosome 3, clone RP23-358I23
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 202487)
  AUTHORS   Birren,B., Nusbaum,C., Lander,E., Ali,A., Allen,N., Anderson,S.,
            Barna,N., Bastien,V., Bloom,T., Boguslavkiy,L., Boukhgalter,B.,
            Camarata,J., Chang,J., Chazaro,B., Choepel,Y., Collymore,A.,
            Cook,A., Cooke,P., DeArellano,K., Dewar,K., Diaz,J.S., Dodge,S.,
            Faro,S., Ferreira,P., FitzGerald,M., Gage,D., Galagan,J.,
            Gardyna,S., Gord,S., Graham,L., Grand-Pierre,N., Hafez,N.,
            Hagos,B., Horton,L., Hulme,W., Iliev,I., Johnson,R., Jones,C.,
            Kamat,A., Karatas,A., Kells,C., Landers,T., Levine,R.,
            Lindblad-Toh,K., Liu,G., MacLean,C., Macdonald,P., Major,J.,
            Matthews,C., McCarthy,M., Meldrim,J., Meneus,L., Mihova,T.,
            Mlenga,V., Murphy,T., Naylor,J., Nguyen,C., Nicol,R., Norbu,C.,
            Norman,C.H., O'Connor,T., O'Donnell,P., O'Neil,D., Oliver,J.,
            Peterson,K., Phunkhang,P., Pierre,N., Raymond,C., Retta,R.,
            Rise,C., Rogov,P., Roman,J., Roy,A., Schauer,S., Schupback,R.,
            Seaman,S., Severy,P., Smith,C., Spencer,B., Stange-Thomann,N.,
            Stojanovic,N., Talamas,J., Tesfaye,S., Theodore,J., Topham,K.,
            Travers,M., Vassiliev,H., Viel,R., Vo,A., Wilson,B., Wu,X.,
            Wyman,D., Young,G., Zainoun,J., Zembek,L., Zimmer,A. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-DEC-2002) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
REFERENCE   3  (bases 1 to 202487)
  AUTHORS   Birren,B., Nusbaum,C., Lander,E., Abouelleil,A., Allen,N.,
            Anderson,M., Arachchi,H.M., Barna,N., Bastien,V., Bloom,T.,
            Boguslavkiy,L., Boukhgalter,B., Camarata,J., Chang,J., Choepel,Y.,
            Collymore,A., Cook,A., Cooke,P., Corum,B., DeArellano,K.,
            Diaz,J.S., Dodge,S., Dooley,K., Dorris,L., Erickson,J., Faro,S.,
            Ferreira,P., FitzGerald,M., Gage,D., Galagan,J., Gardyna,S.,
            Graham,L., Grand-Pierre,N., Hafez,N., Hagopian,D., Hagos,B.,
            Hall,J., Horton,L., Hulme,W., Iliev,I., Johnson,R., Jones,C.,
            Kamat,A., Karatas,A., Kells,C., Landers,T., Levine,R.,
            Lindblad-Toh,K., Liu,X., Lui,A., Mabbitt,R., MacLean,C.,
            Macdonald,P., Major,J., Manning,J., Matthews,C., McCarthy,M.,
            Meldrim,J., Meneus,L., Mihova,T., Mlenga,V., Murphy,T., Naylor,J.,
            Nguyen,C., Nicol,R., Norbu,C., O'Connor,T., O'Donnell,P.,
            O'Neil,D., Oliver,J., Peterson,K., Phunkhang,P., Pierre,N.,
            Rachupka,A., Ramasamy,U., Raymond,C., Retta,R., Rise,C., Rogov,P.,
            Roman,J., Schauer,S., Schupback,R., Seaman,S., Severy,P., Smith,C.,
            Spencer,B., Stange-Thomann,N., Stojanovic,N., Stubbs,M.,
            Talamas,J., Tesfaye,S., Theodore,J., Topham,K., Travers,M.,
            Vassiliev,H., Venkataraman,V.S., Viel,R., Vo,A., Wilson,B., Wu,X.,
            Wyman,D., Young,G., Zainoun,J., Zembek,L., Zimmer,A. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-DEC-2003) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT     On Dec 18, 2003 this sequence version replaced gi:29150501.
            All repeats were identified using RepeatMasker:
            Smit, A.F.A. & Green, P. (1996-1997)
            http://ftp.genome.washington.edu/RM/RepeatMasker.html
            Genome Center
            Center: Whitehead Institute/ MIT Center for Genome Research
            Center code: WIBR
            Web site: http://www-seq.wi.mit.edu
            Contact: sequence_submissions@genome.wi.mit.edu
            Project Information
            Center project name: L28921
            Center clone name: 358 I 23

            NOTE: This is a 'working draft' sequence. It currently
            consists of 7 contigs. The true order of the pieces
            is not known and their order in this sequence record is
            arbitrary. Gaps between the contigs are represented as
            runs of N, but the exact sizes of the gaps are unknown.
            This record will be updated with the finished sequence
            as soon as it is available and the accession number will
            be preserved.

                1   68110: contig of 68110 bp in length
            68111   68210: gap of 100 bp
FEATURES             Location/Qualifiers
     source          1..202487
                     /organism="Mus musculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10090"
                     /chromosome="3"
                     /map="3"
                     /clone="RP23-358I23"
                     /clone_lib="RPCI-23 Female Mouse BAC"
ORIGIN

  Query Match             96.9%;  Score 1494.8;  DB 2;  Length 202487;
  Best Local Similarity   99.9%;  Pred. No. 0;
  Matches 1496;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;

Qy 46 GGCACAGAATTTAT CTT GT GAGAAT T GGTT GGCAACAGAGGCTAT CT T GAATAAGTACTA 105 

I I I I I I I I I I I II i M I I I I I I II II I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I 
Db 105379 G G C AC AGAAT T TAT CT T GT GAGAAT T GGT T GGCAACAGAG GC T AT CTT GAATAAGTACTA 

105320 



Qy 106 CCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGT 165 

I I I I I I I I I I I i I I I II I I I I I I i I I I I I I I I I I I I I II I I II I I I I I i M I I I I I I I I I 
Db 105319 CCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGT 

105260 



Qy 166 GTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCT 225 

III I I I I I I I M I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1052 59 GTTTGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCT 

105200 



Qy 226 TTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAA 285 

I 1 I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 105199 TTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAA 

105140 

Qy 28 6 TGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAA 345 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 
Db 10513 9 TGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAA 

105080 



Qy 34 6 C CT CT ACAC CAGCAT CCTCTTCCT C ACT T T CAT TAG CAT G GAC C GAT AT C T GC T CAT GAA 405 

I I I I II I I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 10507 9 C CT CT ACAC CAGCAT CCTCTTCCT C ACT T T CAT T AGC AT GGAC C GAT AT C T G C T CAT GAA 

105020 



Qy 4 06 GTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGC 4 65 



Db 

104960 

Qy 

Db 

104900 

Qy 

Db 

104840 

Qy 

Db 

104780 

Qy 

Db 

104720 

Qy 

Db 

104660 

Qy 

Db 

104600 

Qy 

Db 

104540 

Qy 

Db 

104480 

Qy 

Db 

104420 

Qy 

Db 

104360 

Qy 



105019 GTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGC 



4 66 TGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCC 525 
I I I I I I I II I I I II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
104959 TGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCC 



52 6 AAAAGAAGAGGG C AGT AACT GC AT C GAC TAT GCAAGT T C T GGAAAC CCT GAAC ACAAT CT 585 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
104 8 99 AAAAGAAGAG G G C AGT AACT GC AT C GACT AT G CAAGT T CT GGAAAC CCT GAAC ACAAT C T 



586 CATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTT 645 
I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II M I I I I I 
104839 CATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTT 



64 6 CTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCT 7 05 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
104779 CTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCT 



706 GCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTT 765 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I 
104 719 GCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCAGTTGTGATCTTCTCTATACTCTT 



7 66 C AC AC C C TAT CAT AT CAT GC GCAAT TT G AGGAT CG C CT CAC GC CT GGAT AGTT GGC C AC A 825 

I 1 I I I I I I I I I I I I I 1 I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I II I I I I I I 
104659 C ACAC C C TAT CAT AT CAT GC GCAAT TT GAGGAT CGC CT CAC GC CT G GAT AGTT GGC CAC A 



826 AGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCT 885 
I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I 
104599 AGGAT GT AC AC AGAAG GC C AT CAAAT CT AT AT ACACACT GACAC GGC CT CTGGCCTTTCT 



886 GAAC AGT GC CAT CAAT C C C AT CT T CT ACT T CCT CAT GGGAGAC CAT T AC AGAGAGAT GCT 945 

I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
104539 GAAC AGT GC CAT CAAT C C CAT CT T CT ACT T CCT CAT GG GAGAC CAT T ACAGAGAGAT GCT 



94 6 GAT T AGT AAGT T CAGACAAT AC T T CAAGT C C C T T ACAT CCT T CAG GAC AT GAGCT GCT G G 1005 
II I I I I I 1 I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II 
10447 9 GATT AGT AAGTT CAGACAAT ACT T CAAGT CCCTT ACAT CCTTCAGGAC AT GAGCT GCT GG 



1006 AT G C AGGT C T T CACT CAG C CAAAAT GAG AC ACT T GAT AAAC AGT G CT GT G CAGT T GAGTT 1065 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
104419 AT GCAGGTCTT CACT CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGC AGTT GAGTT 



1066 TT AACT AAGT AAACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGG 1125 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 



Db 104359 TTAACTAAGTAAACCACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGG 104300



Qy 1126 CTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTT 1185 

II I I I I I M I I II I I I I ! I I I M I I I I I I I I I I I I I I I I I I M I II I I I II I II I I I I II 
Db 104299 CTGGAGTACAAGCTGGGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTT 104240

Qy 1186 ATACC CAGAGT AT GGAAAAAATAAGGC AT GAGAAAGCATT GACAT CTT CACTTAAGAACT 124 5 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II 
Db 104239 ATACCCAGAGTATGGAAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACT 104180

Qy 124 6 GAACAAAAGAGAACAAAT AT T GT CAAT GT T T GGAC ACT T AGGAT CT GAAAT CT T GGAAAT 1305 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 104179 GAACAAAAGAGAACAAAT AT T GT CAAT GT T T GGAC ACT T AGGAT CT GAAAT CTT GGAAAT 

104120 

Qy 1306 TTTAAGACCTCTTTTTCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTG 1365 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 104119 T T TAAGAC CTCTTTTT CT AT C AGT GTAAAAG GAAT ACAAGAT AG CT AGT T GCAAAT GCT G 

104060 

Qy 1366 AAT GC AT T T CAT CAT T GGT C AG GT CGAT AAG CGTGTTTCT GAAAT AGT C T TAT T T T TAT T 1425 

I II I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I II I I I 
Db 104 059 AAT GC AT T T CAT CAT T GGT CAG GT C GAT AAGC GT GTTT CT GAAAT AGT CT T AT T T TT AT T 

104000 

Qy 1426 CTT GT AAT AT T AAAAT T TAT GT GAAAAAT GAAT AT AATT CAAT GT AC AAC AT T AGAT T T T 1485 

I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I II 
Db 103999 CTT GT AAT AT T AAAAT T TAT G T GAAAAAT GAAT AT AAT T CAAT G T AC AAC AT TAG AT T T T 

103940 

Qy 14 86 CTATTT GAAAATTATATTTCTT GAAAAAATAACT GCT GT GCCTAAATAAAT CAAT AT A 1543 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 103939 CTATTT GAAAATTAT ATTT CTT GAAAAAATAACT GCT GT GCCTAAATAAAT CAAT AT A 103882 
LOCUS AC111231 239576 bp DNA linear HTG 13-MAY-2003
DEFINITION Rattus norvegicus clone CH230-96O13, *** SEQUENCING IN PROGRESS
***, 2 unordered pieces.
ACCESSION AC111231
VERSION AC111231.7 GI:30578486
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_ENRICHED.
SOURCE Rattus norvegicus (Norway rat)
ORGANISM Rattus norvegicus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
Rattus.
REFERENCE 1 (bases 1 to 239576)
AUTHORS Muzny,D.Marie., Metzker,M.Lee., Abramzon,S., Adams,C., Alder,J.,
Allen,C., Allen,H., Alsbrooks,S., Amin,A., Anguiano,D.,
Anyalebechi,V., Aoyagi,A., Ayodeji,M., Baca,E., Baden,H.,
Baldwin,D., Bandaranaike,D., Barber,M., Barnstead,M., Benahmed,F.,
Biswalo,K., Blair,J., Blankenburg,K., Blyth,P., Brown,M.,
Bryant,N., Buhay,C., Burch,P., Burrell,K., Calderon,E.,
Cardenas,V., Carter,K., Cavazos,I., Ceasar,H., Center,A.,
Chacko,J., Chavez,D., Chen,G., Chen,R., Chen,Y., Chen,Z., Chu,J.,
Cleveland,C., Cockrell,R., Cox,C., Coyle,M., Cree,A., D'Souza,L.,
Davila,M.L., Davis,C., Davy-Carroll,L., DeAnda,C., Dederich,D.,
Delgado,O., Denson,S., Deramo,C., Ding,Y., Dinh,H., Divya,K.,
Draper,H., Dugan-Rocha,S., Dunn,A., Durbin,K., Duval,B., Eaves,K.,
Egan,A., Escotto,M., Eugene,C., Evans,C.A., Falls,T., Fan,G.,
Fernandez,S., Finley,M., Flagg,N., Forbes,L., Foster,M., Foster,P.,
Fraser,C.M., Gabisi,A., Ganta,R., Garcia,A., Garner,T., Garza,M.,
Gebregeorgis,E., Geer,K., Gill,R., Grady,M., Guerra,W., Guevara,W.,
Gunaratne,P., Haaland,W., Hamil,C., Hamilton,C., Hamilton,K.,
Harvey,Y., Havlak,P., Hawes,A., Henderson,N., Hernandez,J.,
Hernandez,R., Hines,S., Hladun,S.L., Hodgson,A., Hogues,M.,
Hollins,B., Howells,S., Hulyk,S., Hume,J., Idlebird,D., Jackson,A.,
Jackson,L., Jacob,L., Jiang,H., Johnson,B., Johnson,R., Jolivet,A.,
Karpathy,S., Kelly,S., Kelly,S., Khan,Z., King,L., Kovar,C.,
Kowis,C., Kraft,C.L., Lebow,H., Levan,J., Lewis,L., Li,Z., Liu,J.,
Liu,J., Liu,W., Liu,Y., London,P., Longacre,S., Lopez,J.,
Lorensuhewa,L., Loulseged,H., Lozado,R.J., Lu,X., Ma,J.,
Maheshwari,M., Mahindartne,M., Mahmoud,M., Malloy,K., Mangum,A.,
Mangum,B., Mapua,P., Martin,K., Martin,R., Martinez,E.,
Mawhiney,S., McLeod,M.P., McNeill,T.Z., Meenen,E.,
Milosavljevic,A., Miner,G., Minja,E., Montemayor,J., Moore,S.,
Morgan,M., Morris,K., Morris,S., Munidasa,M., Murphy,M., Nair,L.,
Nankervis,C., Neal,D., Newton,N., Nguyen,N., Norris,S.,
Nwaokelemeh,O., Okwuonu,G., Olarnpunsagoon,A., Pal,S., Parks,K.,
Pasternak,S., Paul,H., Perez,A., Perez,L., Pfannkoch,C.,
Plopper,F., Poindexter,A., Popovic,D., Primus,E., Pu,L.-L.,
Puazo,M., Quiroz,J., Rachlin,E., Reeves,K., Regier,M.A., Reigh,R.,
Reilly,B., Reilly,M., Ren,Y., Reuter,M., Richards,S., Riggs,F.,
Rives,C., Rodkey,T., Rojas,A., Rose,M., Rose,R., Ruiz,S.J.,
Sanders,W., Savery,G., Scherer,S., Scott,G., Shatsman,S., Shen,H.,
Shetty,J., Shvartsbeyn,A., Sisson,I., Sitter,C.D., Smajs,D.,
Sneed,A., Sodergren,E., Song,X.-Z., Sorelle,R., Sosa,J.,
Steimle,M., Strong,R., Sutton,A., Svatek,A., Tabor,P., Taylor,C.,
Taylor,T., Thomas,N., Thomas,S., Tingey,A., Trejos,Z., Usmani,K.,
Valas,R., Vera,V., Villasana,D., Waldron,L., Walker,B., Wang,J.,
Wang,Q., Wang,S., Warren,J., Warren,R., Wei,X., White,F.,
Williams,G., Willson,R., Wleczyk,R., Wooden,H., Worley,K.,
Wright,D., Wright,R., Wu,J., Yakub,S., Yen,J., Yoon,L., Yoon,V.,
Yu,F., Zhang,J., Zhou,J., Zhou,X., Zhao,S., Dunn,D., von
Niederhausern,A., Weiss,R., Smith,D.R., Holt,R.A., Smith,H.O.,
Weinstock,G. and Gibbs,R.A.
TITLE Direct Submission
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 239576)
AUTHORS Worley,K.C.
TITLE Direct Submission
JOURNAL Submitted (19-FEB-2002) Human Genome Sequencing Center, Department
of Molecular and Human Genetics, Baylor College of Medicine, One
Baylor Plaza, Houston, TX 77030, USA
REFERENCE 3 (bases 1 to 239576)
AUTHORS Rat Genome Sequencing Consortium.
TITLE Direct Submission
JOURNAL Submitted (13-MAY-2003) Human Genome Sequencing Center, Department
of Molecular and Human Genetics, Baylor College of Medicine, One
Baylor Plaza, Houston, TX 77030, USA
COMMENT On May 13, 2003 this sequence version replaced gi: 24819079. 

The sequence in this assembly is a combination of BAC based reads 
and whole genome shotgun sequencing reads assembled using Atlas 
(http://www.hgsc.bcm.tmc.edu/projects/rat/). Each contig described 
in the feature table below represents a scaffold in the Atlas 
assembly (a 'contig-scaffold'). Within each contig-scaffold,
individual sequence contigs are ordered and oriented, and separated
by sized gaps filled with Ns to the estimated size. The sequence
may extend beyond the ends of the clone and there may be sequence
contigs within a contig-scaffold that consist entirely of whole
genome shotgun sequence reads. Both end sequences and whole genome
shotgun sequence only contigs will be indicated in the feature
table.

Genome Center 

Center: Baylor College of Medicine 
Center code: BCM 

Web site: http://www.hgsc.bcm.tmc.edu/ 

Contact: hgsc-help@bcm.tmc.edu 
Project Information 

Center project name: GLVO 

Center clone name: CH230-96O13 
Summary Statistics 

Assembly program: Atlas 3.0; 

Consensus quality: 213738 bases at least Q40 

Consensus quality: 217471 bases at least Q30 

Consensus quality: 220066 bases at least Q20 

Estimated insert size: 227472; sum-of-contigs estimation 

Quality coverage: 6x in Q20 bases; sum-of-contigs estimation 
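The consensus-quality counts above can be read against the standard Phred definition, Q = -10 * log10(p), where p is the estimated per-base error probability. The following is a minimal sketch, not part of the record: the function names are hypothetical, and dividing by the 239576 bp record length (which includes the gap of Ns) is an assumption made only for illustration.

```python
# Counts copied from the Summary Statistics above.
RECORD_LENGTH = 239576  # bp, from the LOCUS line (assumed denominator)
BASES_AT_LEAST = {40: 213738, 30: 217471, 20: 220066}


def phred_to_error_probability(q: int) -> float:
    """Standard Phred relation: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def quality_summary():
    """Return (Q, error probability, fraction of record length) per threshold."""
    rows = []
    for q, count in sorted(BASES_AT_LEAST.items()):
        rows.append((q, phred_to_error_probability(q), count / RECORD_LENGTH))
    return rows


if __name__ == "__main__":
    for q, p_err, fraction in quality_summary():
        print(f"Q{q}: error probability {p_err:.0e}, "
              f"{fraction:.1%} of record length")
```

Under this reading, Q20 corresponds to an expected error rate of 1 in 100 and Q40 to 1 in 10,000.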



NOTE: Estimated insert size may differ from sequence length
(see http://www.hgsc.bcm.tmc.edu/docs/Genbank__draft_data.html).
NOTE: This is a 'working draft' sequence. It currently
consists of 2 contigs. The true order of the pieces
is not known and their order in this sequence record is
arbitrary. Gaps between the contigs are represented as
runs of N, but the exact sizes of the gaps are unknown.
This record will be updated with the finished sequence
as soon as it is available and the accession number will
be preserved.

1 236521: contig of 236521 bp in length
236522 236621: gap of unknown length
236622 239576: contig of 2955 bp in length.
FEATURES Location/Qualifiers
source 1..239576
/organism="Rattus norvegicus"
/mol_type="genomic DNA"
/db_xref="taxon:10116"
/clone="CH230-96O13"
misc_feature 157219..158900
/note="wgs_contig"
misc_feature 206334..207349
/note="wgs_contig"
ORIGIN
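The contig and gap coordinates listed in the COMMENT for this record can be sanity-checked with inclusive 1-based arithmetic (length = end - start + 1). This is a small sketch with hypothetical helper names; it only restates numbers already present in the record.

```python
# Piece layout transcribed from the COMMENT: (start, end, kind), 1-based inclusive.
LAYOUT = [
    (1, 236521, "contig"),
    (236522, 236621, "gap"),
    (236622, 239576, "contig"),
]
RECORD_LENGTH = 239576  # bp, from the LOCUS line


def piece_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def check_layout(layout, record_length: int) -> bool:
    """True if the pieces tile 1..record_length with no overlap or hole."""
    expected_start = 1
    for start, end, _kind in layout:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start - 1 == record_length


if __name__ == "__main__":
    for start, end, kind in LAYOUT:
        print(kind, start, end, piece_length(start, end))
    print("layout tiles the record:", check_layout(LAYOUT, RECORD_LENGTH))
```

Note that the run of Ns standing in for the gap occupies 100 positions in the record even though the true gap size is stated to be unknown.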



Query Match 67.5%; Score 1041.6; DB 2; Length 239576; 

Best Local Similarity 85.2%; Pred. No. 1.2e-221; 



Matches 1287; Conservative 0; Mismatches 194; Indels 29; Gaps 10; 
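The summary figures above appear to be related by simple ratios. The sketch below is an inference from the printed numbers, not documented behavior of the search program: it assumes Query Match is the score divided by the perfect score (1543, the query length given in the run header) and that Best Local Similarity is matches divided by the aligned columns (matches + mismatches + indels). Function names are hypothetical.

```python
PERFECT_SCORE = 1543  # assumed: perfect score equals the 1543-residue query length


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score (assumed definition)."""
    return 100.0 * score / perfect_score


def local_similarity_percent(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of aligned columns (assumed definition)."""
    return 100.0 * matches / (matches + mismatches + indels)


if __name__ == "__main__":
    # Values printed for this hit: Score 1041.6; Matches 1287; Mismatches 194; Indels 29.
    print(round(query_match_percent(1041.6), 1))
    print(round(local_similarity_percent(1287, 194, 29), 1))
```

With these assumptions the computed values round to the 67.5% and 85.2% printed for this hit.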

Qy 4 6 GG CAC AGAAT T TAT C T T GT GAGAAT T GGT T G GCAACAGAGGCT AT CT T GAAT AAGTACT A 105 

I I I I I I I I I I I I I I I I I I I I I I I I I I I Mill I I I I III I I I I I I II I I I I I 
Db 92574 GG C ACAGAAT T TAT C T T GT GAAAAT T GG C T GG C AT T AGAGAAT AT T T T GAAAAAGT AC T A 92515 

Qy 106 CCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGT 165 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I III II I I Mill 

Db 92514 CCTCTCTGCATTTTATGGGATCGAGTTCATTGTTGGAATGCTTGGCAATTTCACCGTGGT 92455 

Qy 166 GTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCT 225 

M M I II II I I I I II II I II II II M M I I II II II I I I II I I II I II I II II M I 
Db 92454 GTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGTAGCAACGTCTATCTCTTCAACCT 92395 

Qy 226 TTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAA 285

II II I II II M II M I I I M I I II I I II I II M M II M M M I II II II II II 

Db 92394 TTCCATCTCTGACCTTGCTTTCCTGTGCACGCTTCCCATGCTGATAAGGAGTTACGCCAC 92335 

Qy 286 T GATAAGGGGAC CTATGGAGAT GTT CT CT GTATAAGCAACCGATAT GT GCTT CACACCAA 345 

M II II I I I I I II II II II I II I M I I I I I II II I II II II I II I II I MM 
Db 92334 TGGGAACTGGAC CTATGGAGAT GTT CTCTGCATAAGCAACCGTTATGT GCTT CAT GCCAA 92275 

Qy 34 6 C CT CT ACAC C AGCAT CCTCTTCCT CACT T T CAT T AGC AT G GAC CGAT AT CT GCT CAT GAA 405 

I M I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I 

Db 92274 C C T CT AC AC C AG CAT CCTTTTCCT CACT T T CAT T AGCAT AGAC CGAT AT CT GCT CAT GAA 92215 

Qy 406 GTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGC 465 

II I I I II I I I I I I I I M I I I I I II I M I I I II I I I II I I I I II II II I I I I I I I I I I 

Db 92214 GTTCCCTTTCC GAGAAC AC AT T CT ACAAAAGAAGGAAT T T GC C AT TT T AAT CTCCCTGGC 92155 

Qy 466 TGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCC 525

I I II I I M I II I I I I I II I I I I II I I I I I II I I II I II I I II I I II II I III 

Db 92154 T GT C T GGGT CT T AGT GAC CT T AGAAGT T CT AC C TAT GCT CAC GT T TAT C ACT T C CAC C C C 92095 

Qy 526 AAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCT 585

II I I I I I I II I I I I I II I I I I I I I I I I I I I I I II I I I I I II II I I I I II 

Db 92094 AATAGAAAAGGGCGACAGCTGTGTCGACTATGCAAGTTCTGGAAACCCTAAATACAGTCT 92035 

Qy 586 CATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTT 645

I II I I I I I I II I M I I II I II I I I I I I I II I I II I II I I I I I I I I I I I I II II I I 

Db 92034 CATTTACAGCCTGTGCCTGACTTTGCTGGGCTTCCTCATTCCTCTGTCTGTAATGTGCTT 91975 

Qy 646 CTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCT 705

II I I I I I II I II I I I I II I I II II I I I I II I I II I I I M I I I I I I II I II II II 

Db 91974 CTTCTACTACAAAATGGTAGTCTTCCTAAAGAAGAGGAGCCAGCAGCAGGCAACTGTGCT 91915 

Qy 706 GCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTT 765 

I II I I I II I I I I I I M I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I II 
Db 91914 ATCGCTGAACAAACCTCTGCGCCTGGTGGTCCTGGCAGTGGTGATCTTCTCTGTACTCTT 91855 

Qy 7 66 C ACAC C CT AT CAT AT CAT GC GCAAT T T GAG GAT C GC CT CAC G C CT GGAT AGT T GGCC AC A 825 

II I I I II II I I I I I II I I I I I I I I II II I II I I I I I I I I I I M I I II I I II II 
Db 918 54 T ACAC CT T AC CAT AT CAT GC G C AAT GT GAG GAT T GC CT CAC G CT T GGAT AGCT G G C CAC A 91795 

Qy 82 6 AG GAT GT ACAC AGAAGGC CAT C AAAT CT AT AT AC ACACT GACAC G GC CT CTGGCCTTTCT 885 

II I I I I I I II I I I I I I I I I I I II I M I I I Mill I II II I I M I II I I I 
Db 91794 G G GAT GT T C C C AGAAGGC CAT CAAAT GCT TAT AC AT C CT GAC C AGAC CTCTGGCCTTTCT 91735 



Qy 886 945
Db 91734 91675



Qy 946 GAT TAGT AAGT T C AGAC AATAC T T CAAGT C C C TT ACAT C C T T C AG GACAT GAGCT GCT GG 1005 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III II 

Db 91674 GTTTAGTAAGTTGAGACAATACTTCAAGTCCCTTACGTCCTTCAGGCTCTGACCT A 91619 

Qy 1006 AT G CAG GT CT T C AC T C AG C CAAAAT GAGAC ACTT G ATAAACAGT GCT GT G CAGT T GAGT T 1065 

M I I I I I I I I I I I I I I I I I III I I I I I || I II I I I I I I I I I 

Db 91618 ATGTAGGT CTT C ACT GAGC CAGAATAAGACT C AACTCTGCAGTTGAGTT 91570 

Qy 1066 T TAACTAAGT AAAC C AC CAT T T CTAGGCT T T AGC - T TT C C AC CAT C CT C CAAC C C C CAG G 1124 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II II 

Db 91569 T T GAC CAAGT AGAC C ACC AC CT CT AGG C T T T AGC GT T C C C AC CAT C CT C CAAC C CT GAGT 91510 

Qy 1125 GCTGGAGTACAAGCTGGGTCCACATGAATCAGAAG-GCAGCTCTCTGTTCTGATTTTAGG 1183 

III III I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I II I I I I 

Db 91509 GC T AGAG C ACAAACT GGGCAC ACAT GAAT C AGAAGAGCAAC CAT CT GT C C CGAT TT T AG G 914 50 

Qy 1184 TT AT AC C C AGAGT AT GGAAAAAAT AAGG CAT GAGAAAGC AT T GACAT CTT CACTT AAGAA 12 43 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I II I 
Db 914 49 CT GT AC C CAGAGT AT GG- AAAAAT GAGG C C C C AGAAAGCAT T GACAT CTT C ACAT AAGAA 91391 

Qy 1244 CT GAACAAAAGAGAACAAATATT GT CAAT GTTT GGACACTTAGGAT CT GAAATCTTGGAA 1303 

I I II I I II I I I I III II MINIM I I I I I I I I I I I I | | | | | | | | | | 

Db 913 90 CT GAACAAAAGAAAACT GAT GTT GT CAAT AT T T G GACACT TAAGAT C CAAG GCGTT GGAG 91331 

Qy 1304 AT T TT AAGACCT CT T - T T T CT AT CAGT GT AAAAGGAAT ACAAGAT AGCT AGT T GCAAAT G 1362 

I M I I I I I I I I I I I I I I I II M I I I I I I I I I I I I I I I I III I I I I I I I I I I 
Db 91330 AT T T T AAGAC AT CT T CT T T CT AT CAGT GT AAAAGGAAT AC GAGAC AG CTAGT T - CT G ACA 91272 

Qy 1363 CT GAAT G C ATT T CAT CAT T GGT CAGGT C GAT AAG CGT GT T T CT GAAAT AGT C TTAT 1418 

I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 91271 CT GAAT GC ATT T T GT CAT T GGT CAG CT T GAT AAGAAT GTTT CT GAAAT AGT CT CT AT TAT 91212 

Qy 1419 TT T TAT T CTT GT AAT AT TAA- AAT T TAT GT GAAAAAT GAAT AT AATT CAAT GT ACAAC AT 1477 

I I I I I I I I I M I I I I I I II I I I I I I I I I III II II I I II I I I I I I I I I I 
Db 91211 TT T TAT T CT T G CAAT AT T AAC CT T T TAT AT GAAT GGT GAGT AGAACT CAAT GT ACAAC AT 91152 

Qy 14 78 T AGAT T T T C TAT T T GAAAAT T ATAT TT CT T GAAAA AATAACTGCTGTGCCTAAATA 1533 

Ml I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I II I I 

Db 91151 TAG CAAT TAT AT T C AGAAAGT ACAT T T CT T GAAAAAAT GAATAACT GCAAT GC CT AAAT A 91092 

Qy 1534 AAT CAAT AT A 1543 

I I I I I I I I 
Db 91091 AAT CAAC AC A 91082 
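In the alignment above the Db coordinates count down (92574 to 91082) while the Qy coordinates count up, which is consistent with the hit lying on the complementary strand of the database entry. The sketch below illustrates that interpretation only; the helper names are hypothetical and the strand assignment is an assumption drawn from the coordinate direction.

```python
# Translation table for complementing DNA, including the ambiguity code N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def forward_interval(db_start: int, db_end: int) -> tuple:
    """Return (low, high) forward-strand coordinates for a Db span
    that may be printed in descending order."""
    return (min(db_start, db_end), max(db_start, db_end))


if __name__ == "__main__":
    # Last Db line of the alignment: Db 91091 AATCAACACA 91082
    shown = "AATCAACACA"
    low, high = forward_interval(91091, 91082)
    print(f"forward strand {low}..{high}: {reverse_complement(shown)}")
```

Applying reverse_complement twice returns the original string, which is a quick check that the translation table is symmetric.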



RESULT 5 
AC116149 
LOCUS AC116149 60298 bp DNA linear HTG 25-MAR-2002
DEFINITION Mus musculus clone RP24-540E9, LOW-PASS SEQUENCE SAMPLING.
ACCESSION AC116149
VERSION AC116149.1 GI:19703273
KEYWORDS HTG; HTGS_PHASE0.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 60298)
AUTHORS Birren,B., Linton,L., Nusbaum,C. and Lander,E.
TITLE Mus musculus, clone RP24-540E9
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 60298)
AUTHORS Birren,B., Linton,L., Nusbaum,C., Lander,E., Ali,A., Allen,N.,
Anderson,S., Barna,N., Bastien,V., Bloom,T., Boguslavkiy,L.,
Boukhgalter,B., Brown,A., Camarata,J., Campopiano,A., Chang,J.,
Chazaro,B., Choepel,Y., Colangelo,M., Collins,S., Collymore,A.,
Cook,A., Cooke,P., DeArellano,K., Dewar,K., Diaz,J.S., Dodge,S.,
Faro,S., Ferreira,P., FitzHugh,W., Gage,D., Galagan,J., Gardyna,S.,
Ginde,S., Gord,S., Goyette,M., Graham,L., Grand-Pierre,N.,
Hagos,B., Horton,L., Hulme,W., Iliev,I., Johnson,R., Jones,C.,
Kamat,A., Karatas,A., Kells,C., LaRocque,K., Lamazares,R.,
Landers,T., Lehoczky,J., Levine,R., Lindblad-Toh,K., Liu,G.,
MacLean,C., Macdonald,P., Major,J., Marquis,N., Matthews,C.,
McCarthy,M., McEwan,P., McKernan,K., Meldrim,J., Meneus,L.,
Mihova,T., Mlenga,V., Murphy,T., Naylor,J., Nguyen,C., Nicol,R.,
Norbu,C., Norman,C.H., O'Connor,T., O'Donnell,P., O'Neil,D.,
Oliver,J., Peterson,K., Phunkhang,P., Pierre,N., Pollara,V.,
Raymond,C., Retta,R., Rieback,M., Riley,R., Rise,C., Rogov,P.,
Roman,J., Rosetti,M., Roy,A., Santos,R., Schauer,S., Schupback,R.,
Seaman,S., Severy,P., Spencer,B., Stange-Thomann,N., Stojanovic,N.,
Strauss,N., Subramanian,A., Talamas,J., Tesfaye,S., Theodore,J.,
Topham,K., Travers,M., Travis,N., Trigilio,J., Vassiliev,H.,
Viel,R., Vo,A., Wilson,B., Wu,X., Wyman,D., Ye,W.J., Young,G.,
Zainoun,J., Zembek,L., Zimmer,A. and Zody,M.
TITLE Direct Submission
JOURNAL Submitted (25-MAR-2002) Whitehead Institute/MIT Center for Genome
Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT All repeats were identified using RepeatMasker:
Smit, A.F.A. & Green, P. (1996-1997)
http://ftp.genome.Washington.edu/RM/RepeatMasker.html
Genome Center
Center: Whitehead Institute/MIT Center for Genome Research
Center code: WIBR
Web site: http://www-seq.wi.mit.edu
Contact: sequence_submissions@genome.wi.mit.edu
Project Information
Center project name: L24912
Center clone name: 540 E 9



NOTE: This record contains 77 individual 
sequencing reads that have not been assembled into 
contigs. Runs of N are used to separate the reads 
and the order in which they appear is completely 
arbitrary. Low-pass sequence sampling is useful for 
identifying clones that may be gene-rich and allows 
overlap relationships among clones to be deduced. 
However, it should not be assumed that this clone 
will be sequenced to completion. In the event that 
the record is updated, the accession number will 
be preserved. 
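The note above says that runs of N separate the unassembled reads in this record. A minimal sketch of recovering the individual pieces and their 1-based coordinates follows; the function name and the `min_gap` threshold are hypothetical choices for illustration, not something the record specifies.

```python
import re


def split_on_n_runs(sequence: str, min_gap: int = 10):
    """Split a sequence on runs of at least `min_gap` N characters.

    Returns a list of (start, end, subsequence) with 1-based inclusive
    coordinates. Shorter runs of N are kept inside a piece, on the
    assumption that they are ambiguous bases rather than separators.
    """
    separator = re.compile(r"N{%d,}" % min_gap, re.IGNORECASE)
    pieces = []
    pos = 0
    for match in separator.finditer(sequence):
        if match.start() > pos:
            pieces.append((pos + 1, match.start(), sequence[pos:match.start()]))
        pos = match.end()
    if pos < len(sequence):
        pieces.append((pos + 1, len(sequence), sequence[pos:]))
    return pieces


if __name__ == "__main__":
    # Toy record: two short reads separated by a 100-base run of N.
    toy = "ACGT" + "N" * 100 + "GGCC"
    for start, end, read in split_on_n_runs(toy):
        print(start, end, read)
```

Reporting coordinates alongside each piece makes it easy to compare the result against the contig and gap table that such records list in their COMMENT section.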
Query Match 41.9%; Score 645.8; DB 2; Length 60298;

Best Local Similarity 84.0%; Pred. No. 2.3e-133;

Matches 673; Conservative 0; Mismatches 127; Indels 1; Gaps 1;

Qy 51 AGAAT T T AT CT T GT GAGAAT T GGT T GG CAACAGAGGCT AT CT T GAAT AAGT ACT AC CT CT 110 

MI I I I I I I I I I I I I I I t I I I I II I I I II I I I I I I II I I I I 

Db 3890 AGATCT GAT AT CTCGCCCTGTGGTGGAATTCTCAGGCTATCTT GAAT AAGT ACTACCTCT 394 9 

Qy 111 CTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCG 170 

M I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 3950 CTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTTG 4009 

Qy 171 GCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCA 230

I I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I 
Db 4010 GCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCA 4069 

Qy 231 TCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATA 290 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I M I I I I I I I I I I I 
Db 4070 TCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATA 4129

Qy 291 AGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCT 350 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I II I I I I I I I I I 
Db 4130 AGGG GAC CT AT G GAGAT GT T CT CT GT ATAAG C AAC C GAT AT GT G C T T CAC AC CAAC CT C T 4189 

Qy 351 AC AC C AGC AT CCTCTTCCT C ACT T T CAT TAG CAT GGAC C GAT AT CT GCT CAT GAAGT AC C 410 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I M II I I I I I I I 
Db 4190 AC AC C AG CAT CCTCTTCCT CAC T T T CAT TAG CAT GGAC C GAT AT C T GC T CAT GAAGT AC C 4249 

Qy 411 CTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCT 470

I I I I I I II I I I II I I I I I I I || | || || | | | Ml 

Db 4250 CTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCT 4309



Qy 471 GGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAG 530 

M II M I I I I I I t I I I I I I I I I I I II I I I | | | M I I I I I I I I I I I I I I I I I I M I I I | I I 
Db 4310 GGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAG 4369 

Qy 531 AAGAGGG C AGTAAC T G CAT C GACT AT G C AAGT T C T G GAAAC C CT GAACACAAT C T CAT T T 590 

I I M I I I I II I I I I I I I I I I I I I I I I I I I | | || | | || I I I I I I I I I I I M I I M I I I || I 
Db 4370 AAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTT 4429 

Qy 591 ACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCT 650 

I I I M I I I I I M I II I I I I I M I I I I I I I I I I II I I I I I I I I I II I I I I I || I I I I | I | | 
Db 4430 ACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCT 4489

Qy 651 ACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCAC 710 

I I I I I I I Ml I I I I M I I I I I I I I I II I I I II I I I I I I I I I II I I I I I I I I I I I I M I I I 
Db 4490 ACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCAC 4549

Qy 711 TGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACAC 770 

I I I I I Mill I I I I I I I I I I II I I II 
Db 4550 TGGAC-AACCCAAACGCCTGGGGGTCCTGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 4 608 

Qy 771 C CT AT CAT AT C AT GC GCAATT T GAGGAT C G C C T C AC GC C T GGATAGT T G GC C ACAAGGAT 830 

Db 4609 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 4668 

Qy 831 GTACACAGAAGGCCATCAAAT 851

II III M 
Db 4669 NNNNNNNNNCGGAGATCTGAT 4689 



RESULT 6 

AC116149/C 

LOCUS AC116149 60298 bp DNA linear HTG 25-MAR-2002
DEFINITION Mus musculus clone RP24-540E9, LOW-PASS SEQUENCE SAMPLING.
ACCESSION AC116149
VERSION AC116149.1 GI:19703273
KEYWORDS HTG; HTGS_PHASE0.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 60298)
AUTHORS Birren,B., Linton,L., Nusbaum,C. and Lander,E.
TITLE Mus musculus, clone RP24-540E9
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 60298)
AUTHORS Birren,B., Linton,L., Nusbaum,C., Lander,E., Ali,A., Allen,N.,
Anderson,S., Barna,N., Bastien,V., Bloom,T., Boguslavkiy,L.,
Boukhgalter,B., Brown,A., Camarata,J., Campopiano,A., Chang,J.,
Chazaro,B., Choepel,Y., Colangelo,M., Collins,S., Collymore,A.,
Cook,A., Cooke,P., DeArellano,K., Dewar,K., Diaz,J.S., Dodge,S.,
Faro,S., Ferreira,P., FitzHugh,W., Gage,D., Galagan,J., Gardyna,S.,
Ginde,S., Gord,S., Goyette,M., Graham,L., Grand-Pierre,N.,
Hagos,B., Horton,L., Hulme,W., Iliev,I., Johnson,R., Jones,C.,
Kamat,A., Karatas,A., Kells,C., LaRocque,K., Lamazares,R.,
Landers,T., Lehoczky,J., Levine,R., Lindblad-Toh,K., Liu,G.,
MacLean,C., Macdonald,P., Major,J., Marquis,N., Matthews,C.,
McCarthy,M., McEwan,P., McKernan,K., Meldrim,J., Meneus,L.,
Mihova,T., Mlenga,V., Murphy,T., Naylor,J., Nguyen,C., Nicol,R.,
Norbu,C., Norman,C.H., O'Connor,T., O'Donnell,P., O'Neil,D.,
Oliver,J., Peterson,K., Phunkhang,P., Pierre,N., Pollara,V.,
Raymond,C., Retta,R., Rieback,M., Riley,R., Rise,C., Rogov,P.,
Roman,J., Rosetti,M., Roy,A., Santos,R., Schauer,S., Schupback,R.,
Seaman,S., Severy,P., Spencer,B., Stange-Thomann,N., Stojanovic,N.,
Strauss,N., Subramanian,A., Talamas,J., Tesfaye,S., Theodore,J.,
Topham,K., Travers,M., Travis,N., Trigilio,J., Vassiliev,H.,
Viel,R., Vo,A., Wilson,B., Wu,X., Wyman,D., Ye,W.J., Young,G.,
Zainoun,J., Zembek,L., Zimmer,A. and Zody,M.
TITLE Direct Submission 

JOURNAL Submitted (25-MAR-2002) Whitehead Institute/MIT Center for Genome
Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT All repeats were identified using RepeatMasker:

Smit, A.F.A. & Green, P. (1996-1997)

http://ftp.genome.Washington.edu/RM/RepeatMasker.html
Genome Center 

Center: Whitehead Institute/ MIT Center for Genome Research 

Center code: WIBR 

Web site: http://www-seq.wi.mit.edu 

Contact: sequence_submissions@genome.wi.mit.edu

Project Information 

Center project name: L24912 
Center clone name: 540 E 9 



* NOTE: This record contains 77 individual 

* sequencing reads that have not been assembled into 

* contigs. Runs of N are used to separate the reads

* and the order in which they appear is completely 

* arbitrary. Low-pass sequence sampling is useful for 

* identifying clones that may be gene-rich and allows 

* overlap relationships among clones to be deduced. 

* However, it should not be assumed that this clone 

* will be sequenced to completion. In the event that 

* the record is updated, the accession number will 

* be preserved. 
contig of 681 bp in length
contig of 691 bp in length
contig of 680 bp in length
gap of 100 bp

contig of 699 bp in length
* 53168 53267: gap of 100 bp

* 53268 53966: contig of 699 bp in length 

* 53967 54066: gap of 100 bp 

Query Match 41.1%; Score 633.6; DB 2; Length 60298; 

Best Local Similarity 97.6%; Pred. No. 1.2e-130; 

Matches 664; Conservative 0; Mismatches 14; Indels 2; Gaps 2; 

T AGC AT G GAC C GAT AT CT G CT CAT GAAGT AC C CT T T C C GAG - AACAC T T T CT ACAAAA- G 436 

I I M I M I I I I I II I I I I II I I I I I I I I I I I I I I I Mill I I M I M | | | | | | | M | 

T AGC AT G GAC C GAT AT CT GC T CAT GAAGT AC C C T T C C C GAGAAACACT T T CT ACAAAAN G 36600 

AAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTA 496
M I I I I I I I I I I M I I I I I I I I I I I I I II I I II I II I I I II I I II I I || I | I | | | | | || 
AAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCTTTAGTGACCTTAGAAGTTCTA 36540 

C C CAT GCT C AC T T T CAT C AAT T CT GT C C CAAAAGAAGAG G G C AGTAAC T GC ATC GAC TAT 556 

I M I I I I I I I I I I I I I I I I M I I I I I II I I I I I I I I | | | | | | | | | | | | | m I I I M II I I 

C C CAT GCT C AC T T T CAT CAAT T CT GT C C CAAAAGAAGAGGG C AGTAAC T GC AT C GACT AT 364 80 

GCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGC 616 
I I I I M I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I | | | | | | 
GCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGC 36420 

TTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAG 676 
I I M I I I I I I I I I I I II II II I I I I I I I I I I I I I I I | | | | | | | | | | | | || | M I I II I II 
TTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAG 36360

AGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTC 736 
M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | M I I I I I I I I I I I 
AGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTC 36300 

C T GG C GGT T GT GAT CT T C T CT AT ACT C T T C AC AC C CT AT CAT AT CAT GC GCAAT TT GAGG 796 

I I I I I I I I I I I I I I I I I I I I I M II I I II I I I I I I I I I I I I M I I I I I | | M M | | | | | 

CT GGC AGT T GT GAT C T T CT CTAT ACT CT T C AC AC C CT AT CAT AT CAT GC GCAAT TT GAG G 36240 
AT CGCCT CACGCCT GGATAGTTGGCCACAAGGAT GTACACAGAAGGC CAT CAAATCTAT A 856 

I I I I I I I I I I I M I I I I I I I I I I I I I I I I I II I I I I I M I M I I I I I I I I II I II I I I I I 

AT CGCCT C AC G CC T GGAT AGTT GGC C ACAAGGAT GTACACAGAAG GC CAT CAAAT CT AT A 36180 

TACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGC CAT CAAT CCCATCTTCTACTTC 916 
I M I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I II I I I I I I I I I I I I I I I 
T ACACAC T GAC AC G GC C T CT GGC C T T T C T GAAC AGT G C CAT CAAT C C CAT C TT CT ACT T C 3612 0 

CT CAT G GGAGACC AT T AC AGAGAGAT GCT GAT T AGTAAGTT C AGACAAT ACT T CAAGT C C 976 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I M I I I I I I I I I I I I I 

C T C AT GG GAGAC CAT T ACAGAGAGAT GCT GAT T AGT AAGT T C AGACAAT ACT T CAAGT C C 36060 
CT T AC AT C CT T CAGGACAT GAG CT GC T G GAT G C AG GT CT T C ACT C AGCCAAAAT GAGAC A 1036 

I I I I I I I I M I I I I I I I I I I I I I I I I I I | I I M | | M | | | M I I I I I I I I I I I I I M I I I 

C T T AC AT C C T T C AG GAC AT GAGCT G C T G GAT GCAGGT C T T CACT C AG C CAAAAT GAGAC A 36000 

CT T GAT AAACAGT GCT GT GC 1056 

II III I I I I 
RESULT 7
AC110839/c 

LOCUS AC110839 326606 bp DNA linear HTG 11-OCT-20

DEFINITION Rattus norvegicus clone CH230-208A12 , *** SEQUENCING IN PROGRESS 

25 unordered pieces. 
ACCESSION AC110839 

VERSION AC110839.4 GI:23820318 

KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_ENRICHED.
SOURCE Rattus norvegicus (Norway rat) 

ORGANISM Rattus norvegicus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; 

Rattus . 

REFERENCE 1 (bases 1 to 326606) 

AUTHORS Muzny,D.Marie., Metzker,M.Lee., Abramzon,S., Adams,C., Alder,J.,
Allen,C., Allen,H., Alsbrooks,S., Amin,A., Anguiano,D.,
Anyalebechi,V., Aoyagi,A., Ayodeji,M., Baca,E., Baden,H.,
Baldwin,D., Bandaranaike,D., Barber,M., Barnstead,M., Benahmed,F.,
Biswalo,K., Blair,J., Blankenburg,K., Blyth,P., Brown,M.,
Bryant,N., Buhay,C., Burch,P., Burrell,K., Calderon,E.,
Cardenas,V., Carter,K., Cavazos,I., Ceasar,H., Center,A.,
Chacko,J., Chavez,D., Chen,G., Chen,R., Chen,Y., Chen,Z., Chu,J.,
Cleveland,C., Cockrell,R., Cox,C., Coyle,M., Cree,A., D'Souza,L.,
Davila,M.L., Davis,C., Davy-Carroll,L., DeAnda,C., Dederich,D.,
Delgado,O., Denson,S., Deramo,C., Ding,Y., Dinh,H., Divya,K.,
Draper,H., Dugan-Rocha,S., Dunn,A., Durbin,K., Duval,B., Eaves,K.,
Egan,A., Escotto,M., Eugene,C., Evans,C.A., Falls,T., Fan,G.,
Fernandez,S., Finley,M., Flagg,N., Forbes,L., Foster,M., Foster,P.,
Fraser,C.M., Gabisi,A., Ganta,R., Garcia,A., Garner,T., Garza,M.,
Gebregeorgis,E., Geer,K., Gill,R., Grady,M., Guerra,W., Guevara,W.,
Gunaratne,P., Haaland,W., Hamil,C., Hamilton,C., Hamilton,K.,
Harvey,Y., Havlak,P., Hawes,A., Henderson,N., Hernandez,J.,
Hernandez,R., Hines,S., Hladun,S.L., Hodgson,A., Hogues,M.,
Hollins,B., Howells,S., Hulyk,S., Hume,J., Idlebird,D., Jackson,A.,
Jackson,L., Jacob,L., Jiang,H., Johnson,B., Johnson,R., Jolivet,A.,
Karpathy,S., Kelly,S., Kelly,S., Khan,Z., King,L., Kovar,C.,
Kowis,C., Kraft,C.L., Lebow,H., Levan,J., Lewis,L., Li,Z., Liu,J.,
Liu,J., Liu,W., Liu,Y., London,P., Longacre,S., Lopez,J.,
Lorensuhewa,L., Loulseged,H., Lozado,R.J., Lu,X., Ma,J.,
Maheshwari,M., Mahindartne,M., Mahmoud,M., Malloy,K., Mangum,A.,
Mangum,B., Mapua,P., Martin,K., Martin,R., Martinez,E.,
Mawhiney,S., McLeod,M.P., McNeill,T.Z., Meenen,E.,
Milosavljevic,A., Miner,G., Minja,E., Montemayor,J., Moore,S.,
Morgan,M., Morris,K., Morris,S., Munidasa,M., Murphy,M., Nair,L.,
Nankervis,C., Neal,D., Newton,N., Nguyen,N., Norris,S.,
Nwaokelemeh,O., Okwuonu,G., Olarnpunsagoon,A., Pal,S., Parks,K.,
Pasternak,S., Paul,H., Perez,A., Perez,L., Pfannkoch,C.,
Plopper,F., Poindexter,A., Popovic,D., Primus,E., Pu,L.-L.,
Puazo,M., Quiroz,J., Rachlin,E., Reeves,K., Regier,M.A., Reigh,R.,
Reilly,B., Reilly,M., Ren,Y., Reuter,M., Richards,S., Riggs,F.,
Rives,C., Rodkey,T., Rojas,A., Rose,M., Rose,R., Ruiz,S.J.,
Sanders,W., Savery,G., Scherer,S., Scott,G., Shatsman,S., Shen,H.,
Shetty,J., Shvartsbeyn,A., Sisson,I., Sitter,C.D., Smajs,D.,
Sneed,A., Sodergren,E., Song,X.-Z., Sorelle,R., Sosa,J.,
Steimle,M., Strong,R., Sutton,A., Svatek,A., Tabor,P., Taylor,C.,
Taylor,T., Thomas,N., Thomas,S., Tingey,A., Trejos,Z., Usmani,K.,
Valas,R., Vera,V., Villasana,D., Waldron,L., Walker,B., Wang,J.,
Wang,Q., Wang,S., Warren,J., Warren,R., Wei,X., White,F.,
Williams,G., Willson,R., Wleczyk,R., Wooden,H., Worley,K.,
Wright,D., Wright,R., Wu,J., Yakub,S., Yen,J., Yoon,L., Yoon,V.,
Yu,F., Zhang,J., Zhou,J., Zhou,X., Zhao,S., Dunn,D., von
Niederhausern,A., Weiss,R., Smith,D.R., Holt,R.A., Smith,H.O.,
Weinstock,G. and Gibbs,R.A.
TITLE Direct Submission
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 326606)
AUTHORS Worley,K.C.
TITLE Direct Submission
JOURNAL Submitted (16-FEB-2002) Human Genome Sequencing Center, Department
of Molecular and Human Genetics, Baylor College of Medicine, One
Baylor Plaza, Houston, TX 77030, USA
REFERENCE 3 (bases 1 to 326606)
AUTHORS Rat Genome Sequencing Consortium.
TITLE Direct Submission
JOURNAL Submitted (11-OCT-2002) Human Genome Sequencing Center, Department
of Molecular and Human Genetics, Baylor College of Medicine, One
Baylor Plaza, Houston, TX 77030, USA
COMMENT On Oct 11, 2002 this sequence version replaced gi:21739250.
The sequence in this assembly is a combination of BAC based reads 
and whole genome shotgun sequencing reads assembled using Atlas 
(http://www.hgsc.bcm.tmc.edu/projects/rat/). Each contig described 
in the feature table below represents a scaffold in the Atlas 
assembly (a 'contig-scaffold'). Within each contig-scaffold,
individual sequence contigs are ordered and oriented, and separated 
by sized gaps filled with Ns to the estimated size. The sequence 
may extend beyond the ends of the clone and there may be sequence 
contigs within a contig-scaffold that consist entirely of whole
genome shotgun sequence reads. Both end sequences and whole genome 
shotgun sequence only contigs will be indicated in the feature 
table. 

Genome Center 

Center: Baylor College of Medicine 
Center code: BCM 

Web site: http://www.hgsc.bcm.tmc.edu/ 

Contact: hgsc-help@bcm.tmc.edu
Project Information 

Center project name: GRKD 

Center clone name: CH230-208A12 
Summary Statistics 

Assembly program: Phrap; version 0.990329 

Consensus quality: 242752 bases at least Q40 

Consensus quality: 250821 bases at least Q30 

Consensus quality: 254983 bases at least Q20 

Estimated insert size: 244968; sum-of-contigs estimation 

Quality coverage: 5x in Q20 bases; sum-of-contigs estimation 



* NOTE: Estimated insert size may differ from sequence length 

* (see http://www.hgsc.bcm.tmc.edu/docs/Genbank_draft_data.html).

* NOTE: This is a 'working draft' sequence. It currently

* consists of 25 contigs. The true order of the pieces 

* is not known and their order in this sequence record is 

* arbitrary. Gaps between the contigs are represented as 

* runs of N, but the exact sizes of the gaps are unknown. 

* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.

contig of 10356 bp in length
gap of unknown length
contig of 5363 bp in length
gap of unknown length
contig of 229449 bp in length
gap of unknown length
contig of 26573 bp in length
gap of unknown length
contig of 4227 bp in length
gap of unknown length
contig of 5691 bp in length
gap of unknown length
contig of 1173 bp in length
gap of unknown length
contig of 1101 bp in length
gap of unknown length
contig of 1031 bp in length
gap of unknown length
contig of 1218 bp in length
gap of unknown length
contig of 1217 bp in length
gap of unknown length
contig of 1329 bp in length
gap of unknown length
contig of 1346 bp in length
gap of unknown length
contig of 1644 bp in length
gap of unknown length
contig of 1614 bp in length
gap of unknown length
contig of 1246 bp in length
gap of unknown length
contig of 1764 bp in length
gap of unknown length
contig of 1770 bp in length
gap of unknown length
contig of 1683 bp in length
gap of unknown length
contig of 3092 bp in length
gap of unknown length
contig of 1362 bp in length
gap of unknown length
contig of 1452 bp in length
gap of unknown length
contig of 1553 bp in length
gap of unknown length
contig of 4556 bp in length
gap of unknown length
contig of 12396 bp in length.

FEATURES Location/Qualifiers
source 1..326606
/organism="Rattus norvegicus"
/mol_type="genomic DNA"
/db_xref="taxon:10116"
/clone="CH230-208A12"



misc_feature 1..1742
/note="wgs_end_extension
clone_end:Sp6"
misc_feature complement(4245..5082)
/note="clone_boundary
clone_end:Sp6
site:EcoRI
end_sequence:RWBKN06TVB"
misc_feature 10457..12850
/note="wgs_contig"
misc_feature 15920..16991
/note="wgs_contig"
misc_feature complement(220129..221101)
/note="clone_boundary
clone_end:T7
site:EcoRI
end_sequence:RWBKN06TJB"
misc_feature 241580..242749
/note="wgs_end_extension
clone_end:T7"
misc_feature 243833..245368
/note="wgs_end_extension
clone_end:T7"

ORIGIN 

Query Match 39.9%; Score 615.8; DB 2; Length 326606; 

Best Local Similarity 89.0%; Pred. No. 1.2e-126; 

Matches 665; Conservative 0; Mismatches 82; Indels 0; Gaps 0; 
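
The percentages in each summary block appear to follow directly from the reported counts. The sketch below is an illustration only, not the search program's own code: it assumes that Query Match is the hit score divided by the perfect score (1543 in the run header), and that Best Local Similarity is matches divided by the aligned columns (matches + mismatches + indels). Both assumptions reproduce the figures printed for RESULT 7 and RESULT 8; the function names are hypothetical.

```python
# Illustrative check of the summary statistics; formulas are inferred
# from the reported numbers, not taken from the search program itself.

PERFECT_SCORE = 1543.0  # "Perfect score" given in the run header


def query_match(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect, 1)


def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all aligned columns."""
    aligned_columns = matches + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


if __name__ == "__main__":
    # RESULT 7: Score 615.8; Matches 665; Mismatches 82; Indels 0
    print(query_match(615.8), best_local_similarity(665, 82, 0))
    # RESULT 8: Score 592.4; Matches 764; Mismatches 246; Indels 4
    print(query_match(592.4), best_local_similarity(764, 246, 4))
```

Under these assumptions the two lines printed are 39.9 89.0 and 38.4 75.3, matching the Query Match and Best Local Similarity values listed for those hits.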

Qy     46 GGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTA 105
          ||||||||||||||||||||| |||||| |||||  ||||  ||| ||||| ||||||||
Db 242326 GGCACAGAATTTATCTTGTGAAAATTGGCTGGCATTAGAGAATATTTTGAAAAAGTACTA 242267

Qy    106 CCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGT 165
          |||||||||||||||||  |||||||||||| ||||| ||||||| ||| |||| |||||
Db 242266 CCTCTCTGCATTTTATGGGATCGAGTTCATTGTTGGAATGCTTGGCAATTTCACCGTGGT 242207

Qy    166 GTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCT 225
          |||||||||||||||||||||||||||||||||||| ||||| |||||||| || |||||
Db 242206 GTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGTAGCAACGTCTATCTCTTCAACCT 242147

Qy    226 TTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAA 285
          ||||||||||||| |||||||||||||||| |||||||| ||||||| |||||| ||||
Db 242146 TTCCATCTCTGACCTTGCTTTCCTGTGCACGCTTCCCATGCTGATAAGGAGTTACGCCAC 242087

Qy    286 TGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAA 345
          ||  ||  |||||||||||||||||||||| ||||||||||| |||||||||||  ||||
Db 242086 TGGGAACTGGACCTATGGAGATGTTCTCTGCATAAGCAACCGTTATGTGCTTCATGCCAA 242027

Qy    346 CCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAA 405
          |||||||||||||||||| |||||||||||||||||||| ||||||||||||||||||||
Db 242026 CCTCTACACCAGCATCCTTTTCCTCACTTTCATTAGCATAGACCGATATCTGCTCATGAA 241967

Qy    406 GTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGC 465
          || |||||||||||||||| |||||||||||||||||||||||||||||||||| |||||
Db 241966 GTTCCCTTTCCGAGAACACATTCTACAAAAGAAGGAATTTGCCATTTTAATCTCCCTGGC 241907

Qy    466 TGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCC 525
          |||||||| |||||||||||||||||||||||| |||||||| || |||| |||   |||
Db 241906 TGTCTGGGTCTTAGTGACCTTAGAAGTTCTACCTATGCTCACGTTTATCACTTCCACCCC 241847

Qy    526 AAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCT 585
          || |||| |||||   | |||  |||||||||||||||||||||||||| || ||| |||
Db 241846 AATAGAAAAGGGCGACAGCTGTGTCGACTATGCAAGTTCTGGAAACCCTAAATACAGTCT 241787

Qy    586 CATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTT 645
          |||||||||||| |||||||||||| |||||||||| |||||||| ||||| ||||||||
Db 241786 CATTTACAGCCTGTGCCTGACTTTGCTGGGCTTCCTCATTCCTCTGTCTGTAATGTGCTT 241727

Qy    646 CTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCT 705
          |||||||||||| |||||||||||| |||||| ||||||||||||||| |||||||  ||
Db 241726 CTTCTACTACAAAATGGTAGTCTTCCTAAAGAAGAGGAGCCAGCAGCAGGCAACTGTGCT 241667

Qy    706 GCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTT 765
            | ||| ||||||| |  ||||||||||||||||| || |||||||||||| |||||||
Db 241666 ATCGCTGAACAAACCTCTGCGCCTGGTGGTCCTGGCAGTGGTGATCTTCTCTGTACTCTT 241607

Qy    766 CACACCCTATCATATCATGCGCAATTT 792
           ||||| || ||||||||||||||| |
Db 241606 TACACCTTACCATATCATGCGCAATGT 241580



RESULT 8
AF247785
LOCUS AF247785 1325 bp mRNA linear PRI 26-MAR-2002
DEFINITION Homo sapiens P2Y purinoceptor 1 mRNA, complete cds.
ACCESSION AF247785
VERSION AF247785.1 GI:19716154
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1325)
AUTHORS Zhang,W., Li,N., Wan,T. and Cao,X.
TITLE Human P2Y purinoceptor 1
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 1325)
AUTHORS Zhang,W., Li,N., Wan,T. and Cao,X.
TITLE Direct Submission
JOURNAL Submitted (21-MAR-2000) Department of Immunology, Second Military
Medical University & Shanghai Brilliance Biotechnology Institute,
800 Xiangyin Rd., Shanghai 200433, P.R. China
FEATURES Location/Qualifiers
source 1..1325
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
CDS 69..1073
/codon_start=1
/product="P2Y purinoceptor 1"
/protein_id="AAL95690.1"
/db_xref="GI:19716155"
/translation="MLGIMAWNATCKNWLAAEAALEKYYLSIFYGIEFVVGVLGNTIV
VYGYIFSLKNWNSSNIYLFNLSVSDLAFLCTLPMLIRSYANGNWIYGDVLCISNRYVL
HANLYTSILFLTFISIDRYLIIKYPFREHLLQKKEFAILISLAIWVLVTLELLPILPL
INPVITDNGTTCNDFASSGDPNYNLIYSMCLTLLGFLIPLFVMCFFYYKIALFLKQRN
RQVATALPLEKPLNLVIMAVVIFSVLFTPYHVMRNVRIASRLGSWKQYQCTQVVINSF
YIVTRPLAFLNSVINPVFYFLLGDHFRDMLMNQLRHNFKSLTSFSRWAHELLLSFREK
"

ORIGIN



Query Match 38.4%; Score 592.4; DB 9; Length 1325; 

Best Local Similarity 75.3%; Pred. No. 1.6e-121; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2; 



Qy   39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db   76 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 135

Qy   99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db  136 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 195

Qy  159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db  196 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 255

Qy  219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db  256 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 315

Qy  279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db  316 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 375

Qy  339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db  376 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 435

Qy  399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db  436 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 495

Qy  459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db  496 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 555

Qy  519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db  556 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 615

Qy  579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db  616 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 675

Qy  639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db  676 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 735

Qy  699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db  736 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 795

Qy  759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db  796 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 855

Qy  819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db  856 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 915

Qy  876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db  916 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 975

Qy  936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db  976 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1035

Qy  996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db 1036 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1089





RESULT 9
AX549281
LOCUS AX549281 1380 bp DNA linear PAT 26-NOV-2002
DEFINITION Sequence 566 from Patent WO02061087.
ACCESSION AX549281
VERSION AX549281.1 GI:25813951
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Burmer,G.C., Roush,C.L. and Brown,J.P.
TITLE Antigenic peptides, such as for G protein-coupled receptors
(GPCRs), antibodies thereto, and systems for identifying such
antigenic peptides
JOURNAL Patent: WO 02061087-A 566 08-AUG-2002;
Lifespan Biosciences, Inc. (US)
FEATURES Location/Qualifiers
source 1..1380
/organism="Homo sapiens"
/mol_type="unassigned DNA"
/db_xref="taxon:9606"

ORIGIN



Query Match 38.4%; Score 592.4; DB 6; Length 1380; 

Best Local Similarity 75.3%; Pred. No. 1.6e-121; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;



Qy   39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db   50 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 109

Qy   99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db  110 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 169

Qy  159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db  170 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 229

Qy  219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db  230 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 289

Qy  279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db  290 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 349

Qy  339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db  350 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 409

Qy  399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db  410 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 469

Qy  459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db  470 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 529

Qy  519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db  530 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 589

Qy  579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db  590 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 649

Qy  639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db  650 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 709

Qy  699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db  710 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 769

Qy  759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db  770 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 829

Qy  819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db  830 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 889

Qy  876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db  890 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 949

Qy  936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db  950 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1009

Qy  996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db 1010 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1063



RESULT 10
AX780453
LOCUS AX780453 1380 bp DNA linear PAT 14-JUL-2003
DEFINITION Sequence 2610 from Patent WO03039443.
ACCESSION AX780453
VERSION AX780453.1 GI:32697447
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Haferlach,T., Schoch,C., Kern,W., Kohlmann,A., Schnittger,S.,
Dugas,M., Eils,R., Brors,B. and Mergenthaler,S.
TITLE Novel genetic markers for leukemias
JOURNAL Patent: WO 03039443-A 2610 15-MAY-2003;
Deutsches Krebsforschungszentrum (DE);
Ludwig-Maximilian-Universitaet Muenchen (DE); Haferlach, Torsten,
PD Dr. Dr. (DE); Schoch, Claudia (DE); Kern, Wolfgang (DE)
FEATURES Location/Qualifiers
source 1..1380
/organism="Homo sapiens"
/mol_type="unassigned DNA"
/db_xref="taxon:9606"

ORIGIN



Query Match 38.4%; Score 592.4; DB 6; Length 1380;

Best Local Similarity 75.3%; Pred. No. 1.6e-121;

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;

Qy   39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db   50 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 109

Qy   99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db  110 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 169

Qy  159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db  170 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 229

Qy  219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db  230 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 289

Qy  279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db  290 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 349

Qy  339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db  350 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 409

Qy  399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db  410 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 469

Qy  459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db  470 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 529

Qy  519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db  530 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 589

Qy  579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db  590 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 649

Qy  639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db  650 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 709

Qy  699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db  710 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 769

Qy  759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db  770 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 829

Qy  819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db  830 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 889

Qy  876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db  890 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 949

Qy  936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db  950 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1009

Qy  996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db 1010 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1063



RESULT 11
AF348078
LOCUS AF348078 1380 bp mRNA linear PRI 03-APR-2001
DEFINITION Homo sapiens G-protein coupled receptor 91 (GPR91) mRNA, complete
cds.
ACCESSION AF348078
VERSION AF348078.1 GI:13517982
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1380)
AUTHORS Wittenberger,T., Schaller,H.C. and Hellebrand,S.
TITLE An expressed sequence tag (EST) data mining strategy succeeding in
the discovery of new G-protein coupled receptors
JOURNAL J. Mol. Biol. 307 (3), 799-813 (2001)
MEDLINE 21172992
PUBMED 11273702
REFERENCE 2 (bases 1 to 1380)
AUTHORS Wittenberger,T., Schaller,C.H. and Hellebrand,S.
TITLE Direct Submission
JOURNAL Submitted (08-FEB-2001) ZMNH, Institut fur
Entwicklungsneurobiologie, Martinistr. 52, Hamburg 20246, Germany
FEATURES Location/Qualifiers
source 1..1380
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/chromosome="3"
/map="3q24-q25.1"
gene 1..1380
/gene="GPR91"
CDS 55..1047
/gene="GPR91"
/note="orphan receptor"
/codon_start=1
/product="G-protein coupled receptor 91"
/protein_id="AAK29080.1"
/db_xref="GI:13517983"
/translation="MAWNATCKNWLAAEAALEKYYLSIFYGIEFVVGVLGNTIVVYGY
IFSLKNWNSSNIYLFNLSVSDLAFLCTLPMLIRSYANGNWIYGDVLCISNRYVLHANL
YTSILFLTFISIDRYLIIKYPFREHLLQKKEFAILISLAIWVLVTLELLPILPLINPV
ITDNGTTCNDFASSGDPNYNLIYSMCLTLLGFLIPLFVMCFFYYKIALFLKQRNRQVA
TALPLEKPLNLVIMAVVIFSVLFTPYHVMRNVRIASRLGSWKQYQCTQVVINSFYIVT
RPLAFLNSVINPVFYFLLGDHFRDMLMNQLRHNFKSLTSFSRWAHELLLSFREK"

ORIGIN



Query Match 38.4%; Score 592.4; DB 9; Length 1380;

Best Local Similarity 75.3%; Pred. No. 1.6e-121;

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;

Qy   39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db   50 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 109

Qy   99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db  110 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 169

Qy  159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db  170 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 229

Qy  219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db  230 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 289

Qy  279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db  290 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 349

Qy  339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db  350 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 409

Qy  399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db  410 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 469

Qy  459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db  470 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 529

Qy  519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db  530 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 589

Qy  579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db  590 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 649

Qy  639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db  650 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 709

Qy  699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db  710 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 769

Qy  759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db  770 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 829

Qy  819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db  830 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 889

Qy  876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db  890 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 949

Qy  936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db  950 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1009

Qy  996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db 1010 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1063





RESULT 12
BC030948
LOCUS BC030948 1449 bp mRNA linear PRI 12-NOV-2003
DEFINITION Homo sapiens G protein-coupled receptor 91, mRNA (cDNA clone
MGC:32514 IMAGE:4594810), complete cds.
ACCESSION BC030948
VERSION BC030948.1 GI:21410927
KEYWORDS MGC.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1449)
AUTHORS Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G.,
Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D.,
Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K.,
Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F.,
Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L.,
Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L.,
Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S.,
Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J.,
Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J.,
McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S.,
Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W.,
Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A.,
Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S.,
Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y.,
Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D.,
Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M.,
Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E.,
Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A.
TITLE Generation and initial analysis of more than 15,000 full-length
human and mouse cDNA sequences
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002)
MEDLINE 22388257
PUBMED 12477932
REFERENCE 2 (bases 1 to 1449)
AUTHORS Strausberg,R.
TITLE Direct Submission
JOURNAL Submitted (03-JUN-2002) National Institutes of Health, Mammalian
Gene Collection (MGC), Cancer Genomics Office, National Cancer
Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
USA
REMARK NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT Contact: MGC help desk
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: CLONTECH
cDNA Library Preparation: CLONTECH Laboratories, Inc.
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Sequencing Group at the Stanford Human Genome
Center, Stanford University School of Medicine, Stanford, CA 94305
Web site: http://www-shgc.stanford.edu
Contact: (Dickson, Mark) mcd@paxil.stanford.edu
Dickson, M., Schmutz, J., Grimwood, J., Rodriquez, A., and Myers,
R. M.
Clone distribution: MGC clone distribution information can be found
through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
Series: IRAL Plate: 41 Row: e Column: 17
This clone was selected for full length sequencing because it
passed the following selection criteria: matched mRNA gi: 14780893.
FEATURES Location/Qualifiers
source 1..1449
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="MGC:32514 IMAGE:4594810"
/tissue_type="Kidney"
/clone_lib="NIH_MGC_75"
/lab_host="DH10B"
/note="Vector: pDNR-LIB"
gene 1..1449
/gene="GPR91"
/db_xref="LocusID:56670"
/db_xref="MIM:606381"
CDS 112..1104
/codon_start=1
/product="G protein-coupled receptor 91"
/protein_id="AAH30948.2"
/db_xref="GI:37572249"
/db_xref="LocusID:56670"
/translation="MAWNATCKNWLAAEAALEKYYLSIFYGIEFVVGVLGNTIVVYGY
IFSLKNWNSSNIYLFNLSVSDLAFLCTLPMLIRSYANGNWIYGDVLCISNRYVLHANL
YTSILFLTFISIDRYLIIKYPFREHLLQKKEFAILISLAIWVLVTLELLPILPLINPV
ITDNGTTCNDFASSGDPNYNLIYSMCLTLLGFLIPLFVMCFFYYKIALFLKQRNRQVA
TALPLEKPLNLVIMAVVIFSVLFTPYHVMRNVRIASRLGSWKQYQCTQVVINSFYIVT
RPLAFLNSVINPVFYFLLGDHFRDMLMNQLRHNFKSLTSFSRWAHELLLSFREK"
misc_feature 217..984
/note="7tm_1; Region: 7 transmembrane receptor (rhodopsin
family)"
/db_xref="CDD:pfam00001"
ORIGIN


Query Match 38.4%; Score 592.4; DB 9; Length 1449;

Best Local Similarity 75.3%; Pred. No. 1.6e-121;

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;

Qy   39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db  107 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 166

Qy   99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db  167 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 226

Qy  159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db  227 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 286

Qy  219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db  287 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 346

Qy  279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db  347 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 406

Qy  339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db  407 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 466

Qy  399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db  467 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 526

Qy  459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db  527 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 586

Qy  519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db  587 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 646

Qy  579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db  647 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 706

Qy  639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db  707 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 766

Qy  699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db  767 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 826

Qy  759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db  827 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 886

Qy  819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db  887 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 946

Qy  876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db  947 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1006

Qy  936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db 1007 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1066

Qy  996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db 1067 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1120
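
The summary lines printed ahead of each alignment can be cross-checked by hand. The following is a minimal sketch, assuming that Best Local Similarity is the number of matching columns divided by all aligned columns (matches + mismatches + indels) and that Query Match is the score divided by the perfect score of 1543 given in the run header. The helper name `alignment_stats` and the rounding to one decimal place are illustrative assumptions, not part of the search program.

```python
def alignment_stats(matches, mismatches, indels, score, perfect_score):
    """Return (best_local_similarity_pct, query_match_pct), each rounded to 0.1.

    Assumption: similarity = matches / aligned columns,
    query match = score / perfect score.
    """
    columns = matches + mismatches + indels          # every aligned column
    best_local_similarity = 100.0 * matches / columns
    query_match = 100.0 * score / perfect_score
    return round(best_local_similarity, 1), round(query_match, 1)


# Figures copied from the summary lines of a result with 764 matches,
# 246 mismatches, 4 indels and a score of 592.4.
print(alignment_stats(764, 246, 4, 592.4, 1543))
```

Under these assumptions the reported percentages agree with the counts, which is a quick way to spot a digit lost in transcription.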



RESULT 13
AX342665
LOCUS AX342665 1542 bp DNA linear PAT 12-JAN-2002
DEFINITION Sequence 20 from Patent WO0198351.
ACCESSION AX342665
VERSION AX342665.1 GI:18152045
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Lai,P., Baughn,M.R., Hafalia,A.J., Nguyen,D.B., Gandhi,A.R.,
Kallick,D.A., Griffin,J.A., Yue,H., Khan,F.A., Patterson,C.,
Lu,D.A., Tribouley,C.M., Lu,Y., Walia,N.K., Graul,R., Yao,M.G.,
Yang,J., Ramkumar,J., Au-Young,J., Hernandez,R., Walsh,R.T. and
Borowsky,M.L.
JOURNAL Patent: WO 0198351-A 20 27-DEC-2001;
Incyte Genomics, Inc. (US)
FEATURES Location/Qualifiers
source 1..1542
/organism="Homo sapiens"
/mol_type="unassigned DNA"
/db_xref="taxon:9606"
/note="Incyte ID No: 3485895CB1"
ORIGIN


Query Match 38.4%; Score 592.4; DB 6; Length 1542; 

Best Local Similarity 75.3%; Pred. No. 1.6e-121; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 



Qy   39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db  205 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 264

Qy   99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db  265 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 324

Qy  159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db  325 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 384

Qy  219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db  385 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 444

Qy  279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db  445 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 504

Qy  339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db  505 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 564

Qy  399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db  565 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 624

Qy  459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db  625 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 684

Qy  519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db  685 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 744

Qy  579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db  745 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 804

Qy  639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db  805 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 864

Qy  699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db  865 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 924

Qy  759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db  925 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 984

Qy  819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db  985 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 1044

Qy  876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db 1045 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1104

Qy  936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db 1105 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1164

Qy  996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db 1165 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1218
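
The marker row between each Qy and Db line carries no information of its own: with an identity scoring table it can be regenerated from the two sequence rows. The following is a minimal sketch, assuming that a bar marks a column where both rows hold the same base and that gap columns (written `-`) are left blank. The function name `match_line` is hypothetical.

```python
def match_line(query_row, subject_row):
    """Rebuild the marker row of a pairwise alignment.

    Assumption: '|' marks identical bases, a space marks a mismatch
    or a gap column ('-') in either row.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have the same number of columns")
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query_row, subject_row)
    )


# Final row of the alignment above (Qy 996-1048), including its single gap.
qy = "GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG"
db = "GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG"
print(match_line(qy, db))
```

Summing the bars over all rows of one alignment should give the Matches figure in its summary line, so the same helper doubles as a consistency check on a transcribed alignment.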





RESULT 14
AC116026
LOCUS AC116026 90343 bp DNA linear PRI 09-APR-2002
DEFINITION Homo sapiens 3 BAC RP11-3F11 (Roswell Park Cancer Institute Human
BAC Library) complete sequence.
ACCESSION AC116026
VERSION AC116026.1 GI:19697319
KEYWORDS HTG.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 90343)
AUTHORS Muzny,D.M., Adams,C., Adio-Oduola,B., Ali-osman,F.R., Allen,C.,
Alsbrooks,S.L., Amaratunge,H.C., Are,J.R., Ayele,M., Banks,T.,
Barbaria,J., Benton,J., Bimage,K., Blankenburg,K., Bonnin,D.,
Bouck,J., Bowie,S., Brieva,M., Brown,E., Brown,M., Bryant,N.P.,
Buhay,C., Burch,P., Burkett,C., Burrell,K.L., Byrd,N.C.,
Carron,T.F., Carter,M., Cavazos,S.R., Chacko,J., Chavez,D.,
Chen,G., Chen,R., Chen,Z., Chowdhry,I., Christopoulos,C.,
Cleveland,C.D., Cox,C., Coyle,M.D., Dathorne,S.R., David,R.,
Davila,M.L., Davis,C., Davy-Carroll,L., Dederich,D.A.,
Delaney,K.R., Delgado,O., Denn,A.L., Ding,Y., Dinh,H.H.,
Douthwaite,K.J., Draper,H., Dugan-Rocha,S., Durbin,K.J.,
Earnhart,C., Edgar,D., Edwards,C.C., Elhaj,C., Escotto,M.,
Falls,T., Ferraguto,D., Flagg,N., Ford,J., Foster,P., Frantz,P.,
Gabisi,A., Gao,J., Garcia,A., Garner,T., Garza,N., Gill,R.,
Gorrell,J.H., Guevara,W., Gunaratne,P., Hale,S., Hamilton,K.,
Harris,C., Harris,K., Hart,M., Havlak,P., Hawes,A., He,X.,
Hernandez,J., Hernandez,O., Hodgson,A., Hogues,M., Holloway,C.,
Hollins,B., Homsi,F., Howard,S., Huber,J., Hulyk,S., Hume,J.,
Jackson,L.E., Jacobson,B., Jia,Y., Johnson,R., Jolivet,S.,
Joudah,S., Karlsson,E., Kelly,S., Khan,U., King,L., Korvah,J.,
Kovar,C., Kratovic,J., Kureshi,A., Landry,N., Leal,B., Lewis,L.C.,
Lewis,L., Li,J., Li,Z., Lichtarge,O., Lieu,C., Liu,J., Liu,W.,
Loulseged,H., Lozado,R.J., Lu,X., Lucier,A., Lucier,R., Luna,R.,
Ma,J., Maheshwari,M., Mapua,P., Martin,R., Martindale,A.,
Martinez,E., Massey,E., Mawhiney,E., McLeod,M.P., Meador,M.,
Mei,G., Metzker,M., Miner,G., Miner,Z., Mitchell,T., Mohabbat,K.,
Moore,S., Morgan,M., Moorish,T., Morris,S., Moser,M., Neal,D.,
Nelson,D., Newtson,J., Newtson,N., Nguyen,A., Nguyen,N., Nguyen,N.,
Nickerson,E., Nwokenkwo,S., Oguh,M., Okwuonu,G., Oragunye,N.,
Oviedo,R., Pace,A., Payton,B., Peery,J., Perez,L., Peters,L.,
Pickens,R., Primus,E., Pu,L.L., Quiles,M., Ren,Y., Rives,M.,
Rojas,A., Rojubokan,I., Rolfe,M., Ruiz,S., Savery,G., Scherer,S.,
Scott,G., Shen,H., Shooshtari,N., Sisson,I., Sodergren,E.,
Sonaike,T., Sparks,A., Stanley,H., Stone,H., Sutton,A., Svatek,A.,
Tabor,P., Tamerisa,A., Tamerisa,K., Tang,H., Tansey,J., Taylor,C.,
Taylor,T., Telfrod,B., Thomas,N., Thomas,S., Usmani,K., Vasquez,L.,
Vera,V., Villalon,D., Vinson,R., Wang,Q., Wang,S., Ward-Moore,S.,
Warren,R., Washington,C., Watlington,S., Williams,G.,
Williamson,A., Wleczyk,R., Wooden,S., Worley,K., Wu,C., Wu,Y.,
Wu,Y.F., Zhou,J., Zorrilla,S., Naylor,S.L., Weinstock,G. and
Gibbs,R.
TITLE Direct Submission
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 90343)
AUTHORS Worley,K.C.
TITLE Direct Submission
JOURNAL Submitted (23-MAR-2002) Human Genome Sequencing Center, Department
of Molecular and Human Genetics, Baylor College of Medicine, One
Baylor Plaza, Houston, TX 77030, USA
REFERENCE 3 (bases 1 to 90343)
AUTHORS Worley,K.C.
TITLE Direct Submission
JOURNAL Submitted (09-APR-2002) Human Genome Sequencing Center, Department
of Molecular and Human Genetics, Baylor College of Medicine, One
Baylor Plaza, Houston, TX 77030, USA
COMMENT INFORMATION: http://www.hgsc.bcm.tmc.edu/ or email
gc-help@bcm.tmc.edu

CLONE LENGTH: This sequence does not necessarily represent the
entire insert of this clone. Overlapping regions of clones are only
sequenced and submitted once, so the sequence for the remainder of
the insert may be found in the record for the adjacent clones.
Overlapping clones are noted at the beginning and end of the
Features listing.

ANNOTATION OF FEATURES:

STSs are identified using ePCR (Genome Res. 7:541-550) searches
of a local database that includes entries from dbSTS, GDB, and
local mapping efforts.

Repeats are identified using RepeatMasker (A. Smit and P. Green,
unpublished.) for Human and Mouse sequences.

Genes and Region of sequence similarity are identified by BLAST
(Nuc. Acids Res. 25:3389-3402) similarity (expect < 1e-34) to the
EST and cDNA sequences. Genes demonstrate at least two exons
flanked by consensus splice sites that maintained sequence
continuity across the splice junctions. Sequences that are not
identical matches are annotated as similar.

SEQUENCING READ COVERAGE: Sequencing is completed to a minimum
standard of double strand coverage with a minimum of 2 clones and 2
reads with no ambiguities or 2 chemistries with a minimum of 2
clones and 3 reads with no ambiguities. If the sequence quality for
a region does not meet this standard, it will be indicated in the
annotation as Low Coverage.

QUALITY OF INDIVIDUAL BASES: This sequence meets stringent quality
standards - estimated error rate less than 1 per 10,000 bases.
Reports of lowest quality individual bases and measures of base
quality are listed below. Description of the metrics can be found
at URL:

http://gc.bcm.tmc.edu:8088/quality.info/genbank.annotation.html.

QUALSTAT-REPORT.
FEATURES Location/Qualifiers
source 1..90343
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
/chromosome="3"
/clone="RP11-3F11"
repeat_region 991..1106
/rpt_family="MER45B"
repeat_region complement(1314..1627)
/rpt_family="AluSx"
repeat_region complement(2137..2430)
/rpt_family="AluY"
repeat_region complement(2568..2741)
/rpt_family="L1M4"
repeat_region complement(2742..3047)
/rpt_family="AluSx"
repeat_region complement(3048..3165)
/rpt_family="L1M4"
repeat_region 4735..4865
/rpt_family="FLAM_C"
repeat_region 5657..5762
/rpt_family="L1MC/D"
repeat_region 5906..6237
/rpt_family="LTR21B"
repeat_region 6289..6773
/rpt_family="HERVFH21"
repeat_region complement(8725..9597)
/rpt_family="MER11D"
STS 12399..12689
/standard_name="136046"
repeat_region 13774..13816
/rpt_family="Alu"
repeat_region 13817..13874
/rpt_family="(TA)n"
repeat_region complement(15157..15633)
/rpt_family="L2"
repeat_region 15706..15747
/rpt_family="AT_rich"
repeat_region 16025..16235
/rpt_family="MIR"
repeat_region 16560..16682
/rpt_family="L2"
repeat_region complement(16710..17265)
/rpt_family="LTR49"
repeat_region 18077..18368
/rpt_family="AluSx"
repeat_region complement(18376..18471)
/rpt_family="L2"
repeat_region complement(18486..18859)
/rpt_family="MER57B"
repeat_region complement(20618..20922)
/rpt_family="AluSx"
repeat_region 21337..21363
/rpt_family="AT_rich"
repeat_region 22155..22561
/rpt_family="L1M4"
repeat_region complement(22608..22659)
/rpt_family="L1M4"
repeat_region 22685..23013
/rpt_family="L1MB8"
repeat_region 23103..23399
/rpt_family="AluSg"
repeat_region 23500..23973
/rpt_family="L1ME3A"
repeat_region complement(24027..24305)
/rpt_family="L1MB1"
repeat_region 24304..24655
/rpt_family="L1ME3A"
repeat_region 24656..24697
/rpt_family="MADE1"
repeat_region 25203..25518
/rpt_family="AluJo"
repeat_region 25783..25817
/rpt_family="(TAA)n"
repeat_region 26187..26211
/rpt_family="AT_rich"
repeat_region 27014..27030
/rpt_family="AT_rich"
repeat_region complement(27031..27316)
/rpt_family="AluSx"
repeat_region 27317..27328
/rpt_family="AT_rich"
repeat_region 27574..27615
/rpt_family="(TAGA)n"
STS 28062..28166
/standard_name="24707"
STS 28199..28382
/standard_name="13170"
repeat_region complement(29079..29167)
/rpt_family="MLT1J"
repeat_region 29168..29532
/rpt_family="THE1B"
repeat_region complement(29533..29552)
/rpt_family="MLTU"
repeat_region 29807..30387
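
Each feature in the table above is located by a start..end span, optionally wrapped in complement( ) for the reverse strand. The following is a minimal sketch of turning such a span into numbers, assuming only these two simple forms occur (no join( ) or partial markers) and tolerating the stray spaces that scanning introduces. The function name `parse_location` and the returned tuple layout are hypothetical.

```python
import re

# Matches "start..end" or "complement(start..end)" once whitespace is removed.
_LOCATION = re.compile(r"^(complement\()?(\d+)\.\.(\d+)\)?$")


def parse_location(text):
    """Return (start, end, strand, length) for a simple feature location.

    Assumption: coordinates are 1-based and inclusive, so the length
    is end - start + 1.
    """
    compact = re.sub(r"\s+", "", text)   # drop spaces such as "2742 . . 3047"
    found = _LOCATION.match(compact)
    if found is None:
        raise ValueError("unsupported location: " + repr(text))
    start, end = int(found.group(2)), int(found.group(3))
    strand = "-" if found.group(1) else "+"
    return start, end, strand, end - start + 1


print(parse_location("complement (2742 . . 3047)"))
print(parse_location("4735. .4865"))
```

Stripping all whitespace before matching also rejoins digits that were split apart, which is only safe because a location holds nothing but digits, dots and the complement wrapper.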

Query Match 38.3%; Score 590.2; DB 9; Length 90343;

Best Local Similarity 75.5%; Pred. No. 5.8e-121;

Matches 760; Conservative 0; Mismatches 243; Indels 4; Gaps 2;

Qy    46 GGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTA 105
         ||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | ||||||||
Db 80664 GGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAAAGTACTA 80723

Qy   106 CCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGT 165
         ||| ||    |||||||  || |||||| || | ||| | ||||| |||  || ||| ||
Db 80724 CCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCATTGTTGT 80783

Qy   166 GTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCT 225
          | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| ||||||||
Db 80784 TTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCTTTAACCT 80843

Qy   226 TTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAA 285
          ||  |||||||||| ||||| ||||||||||| ||||| ||||||| ||||||||||||
Db 80844 CTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTTATGCCAA 80903

Qy   286 TGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAA 345
         ||  ||  |||  |||||||| || ||||| |||||||||||||||||||||||  ||||
Db 80904 TGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTCATGCCAA 80963

Qy   346 CCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAA 405
         |||||| |||||||| ||||| |||||||| || ||||| || |||||  || | || ||
Db 80964 CCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGATAATTAA 81023

Qy   406 GTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGC 465
         ||| ||||||||||||||| |||| |||||||| || ||||| |||||||||||  ||||
Db 81024 GTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCTCCTTGGC 81083

Qy   466 TGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCC 525
           | ||||  ||||| ||||||||  | |||||||| ||  |  | || ||| ||||
Db 81084 CATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATCCTGTTAT 81143

Qy   526 AAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCT 585
         ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |  |||| ||
Db 81144 AACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACTACAACCT 81203

Qy   586 CATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTT 645
         |||||||||| | || || ||  ||||||| ||||| |||||||| | ||||||||| ||
Db 81204 CATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGATGTGTTT 81263

Qy   646 CTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCT 705
         ||| || |||||||| |   ||||| |||||  |||||    ||||   || ||||| ||
Db 81264 CTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTACTGCTCT 81323

Qy   706 GCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTT 765
         ||| || || || || |    | ||||  || |||| || || ||||||||| | || ||
Db 81324 GCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTGTGCTTTT 81383

Qy   766 CACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTG---GCC 822
          |||||||||||  ||||||| ||| |||||||||| ||||||||||  |||||   ||
Db 81384 TACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTTGGAAGCA 81443

Qy   823 ACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTT 882
           |    || || |||   | |||||| ||  | ||||   ||||||||||| |||||||
Db 81444 GTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTTTGGCCTT 81503

Qy   883 TCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGAT 942
         ||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||| || ||
Db 81504 TCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCAGGGACAT 81563

Qy   943 GCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGC 1002
         |||||| | | |  | |||||  ||||||| |||||||||||||| || | ||| |||
Db 81564 GCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGATGGGCTCA 81623

Qy  1003 TGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
         || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db 81624 TGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 81670

RESULT 15 
AC068647 

LOCUS AC068647 132745 bp DNA linear PRI 24-JUL-2002 

DEFINITION Homo sapiens 3 BAC RP11-64D22 (Roswell Park Cancer Institute Human 

BAC Library) complete sequence. 
ACCESSION AC068647 

VERSION AC068647.10 GI:19774263 

KEYWORDS HTG. 

SOURCE Homo sapiens (human) 

ORGANISM Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (bases 1 to 132745) 

AUTHORS Muzny,D.M., Adams, C, Adio-Oduola, B . , Ali-osman, F. R ., Allen, C, 
Alsbrooks,S.L., Amaratunge, H . C . , Are,J.R., Ayele,M., Banks, T., 



TITLE 
JOURNAL 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



Barbaria,J., Benton, J., Bimage,K., Blankenburg, K. , Bonnin,D., 
Bouck,J., Bowie, S., Brieva,M., Brown, E., Brown, M. , Bryant, N. P., 
Buhay,C, Burch,P., Burkett,C, Burrell, K. L . , Byrd,N.C, 
Carron,T.F., Carter, M. , Cavazos , S . R. , Chacko,J., Chavez, D., 
Chen,G., Chen,R., Chen, Z . , Chowdhry, I . , Chris topoulos , C . , 
Cleveland, C. D. , Cox,C, Coyle,M.D., Dathorne, S . R . , David, R. , 
Davila,M.L. , Davis, C. , Davy-Carroll, L . , Dederich, D . A. , 
Delaney,K.R. , Delgado,0., Denn,A.L., Ding,Y., Dinh,H.H., 
Gabisi,A., Gao,J., Garcia,A., Garner,T., Garza,N., Gill,R.,
Gorrell, J.H., Guevara, W., Gunaratne, P . , Hale,S., Hamilton, K., 
Harris, C, Harris, K., Hart,M., Havlak,P., Hawes,A., He,X., 
Hernandez, J., Hernandez, 0 . , Hodgson, A. , Hogues,M., Holloway,C, 
Hollins,B., Homsi,F., Howard, S., Huber,J., Hulyk,S., Hume, J., 
Jackson, L. E. , Jacobson,B., Jia,Y., Johnson, R. , Jolivet,S., 
Joudah,S., Karlsson,E., Kelly, S., Khan,U., King,L., Korvah,J., 
Kovar,C, Kratovic,J., Kureshi,A., Landry, N., Leal,B., Lewis, L.C., 
Lewis, L., Li, J., Li,Z., Lichtarge, 0 . , Lieu,C, Liu, J., Liu,W., 
Loulseged,H. , Lozado,R.J., Lu,X., Lucier,A., Lucier,R., Luna,R., 
Ma, J., Maheshwari,M. , Mapua,P., Martin, R., Martindale, A. , 
Martinez, E., Massey,E., Mawhiney,E., McLeod,M.P., Meador,M. , 
Mei,G., Metzker,M., Miner, G., Miner, Z., Mitchell, T., Mohabbat,K., 
Moore, S., Morgan, M. , Moorish, T., Morris, S., Moser,M. , Neal,D., 
Nelson, D., Newtson,J., Newtson,N., Nguyen, A. , Nguyen, N., Nguyen, N., 
Nickerson,E. , Nwokenkwo, S . , Oguh,M., 0kwuonu,G., Oragunye,N., 
Oviedo,R., Pace, A., Payton,B., Peery,J., Perez, L., Peters, L., 
Pickens, R., Primus, E., Pu,L.L., Quiles,M., Ren,Y., Rives, M. , 
Rojas,A., Rojubokan,I. , Rolfe,M., Ruiz,S., Savery,G., Scherer,S., 
Scott, G., Shen,H., Shooshtari,N. , Sisson,I., Sodergren, E . , 
Sonaike,T., Sparks, A., Stanley, H., Stone, H., Sutton, A. , Svatek,A., 
Tabor, P., Tamerisa,A., Tamerisa,K., Tang,H., Tansey,J., Taylor, C, 
Taylor, T., Telfrod,B., Thomas, N., Thomas, S., Usmani,K., Vasquez,L., 
Vera, V., Villalon,D., Vinson, R., Wang,Q., Wang,S., Ward-Moore, S . , 
Warren, R. , Washington, C . , Watlington, S . , Williams, G., 
Williamson, A. , Wleczyk,R., Wooden, S., Worley,K., Wu,C, Wu,Y., 
Wu,Y.F., Zhou, J., Zorrilla,S., Naylor,S.L., Weinstock,G. and 
Gibbs, R. 
On Mar 28, 2002 this sequence version replaced gi:19718616
INFORMATION: http://www.hgsc.bcm.tmc.edu/ or email
gc-help@bcm.tmc.edu



CLONE LENGTH: This sequence does not necessarily represent the 
entire insert of this clone. Overlapping regions of clones are only 
sequenced and submitted once, so the sequence for the remainder of 
the insert may be found in the record for the adjacent clones. 
Overlapping clones are noted at the beginning and end of the 
Features listing. 

ANNOTATION OF FEATURES: 

STSs are identified using ePCR (Genome Res. 7:541-550) searches 
of a local database that includes entries from dbSTS, GDB, and 
local mapping efforts. 

Repeats are identified using RepeatMasker (A. Smit and P. Green, 
unpublished.) for Human and Mouse sequences. 

Genes and Region of sequence similarity are identified by BLAST 
(Nuc. Acids Res. 25:3389-3402) similarity (expect < 1e-34) to the
EST and cDNA sequences. Genes demonstrate at least two exons 
flanked by consensus splice sites that maintained sequence 
continuity across the splice junctions. Sequences that are not 
identical matches are annotated as similar. 

SEQUENCING READ COVERAGE : Sequencing is completed to a minimum 
standard of double strand coverage with a minimum of 2 clones and 2 
reads with no ambiguities or 2 chemistries with a minimum of 2 
clones and 3 reads with no ambiguities. If the sequence quality for 
a region does not meet this standard, it will be indicated in the 
annotation as Low Coverage. 



QUALITY OF INDIVIDUAL BASES: This sequence meets stringent quality
standards - estimated error rate less than 1 per 10,000 bases.
Reports of lowest quality individual bases and measures of base
quality are listed below. Description of the metrics can be found
at URL:

http://gc.bcm.tmc.edu:8088/quality.info/genbank.annotation.html.

QUALSTAT-REPORT.
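
The quoted error rate of less than 1 per 10,000 bases can be restated on the logarithmic quality scale often used for base calls. The following is a minimal sketch, assuming the usual Phred convention Q = -10 log10(p). The record itself does not say which quality scale the submitters used, so the mapping is an illustration only and the function name is hypothetical.

```python
import math


def phred_quality(error_probability):
    """Convert a per-base error probability into a Phred-style score.

    Assumption: Q = -10 * log10(p), the common Phred definition.
    """
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(error_probability)


# One expected error per 10,000 bases.
print(phred_quality(1.0 / 10000))
```

On this scale, every tenfold drop in the error rate adds ten points to the score.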
FEATURES             Location/Qualifiers
     source          1..132745
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
                     /chromosome="3"
                     /clone="RP11-64D22"
     misc_feature    1..2005
                     /note="overlaps bases 170209..172213 of clone AC069067"
                     /function="clone overlap"
     STS             30..130
                     /standard_name="74493"
     repeat_region   complement(522..1015)
                     /rpt_family="MLT1D"
     repeat_region   complement(2452..2697)
                     /rpt_family="L1MA5A"
     repeat_region   complement(3200..3578)
                     /rpt_family="MLT1B"
     repeat_region   3600..3749
                     /rpt_family="(TA)n"
     repeat_region   4391..4411
                     /rpt_family="AT_rich"
     repeat_region   4909..4960
                     /rpt_family="AT_rich"
     repeat_region   complement(5657..6403)
                     /rpt_family="L1PA13"
     repeat_region   6404..7799
                     /rpt_family="L1PA13"
     STS             8045..8318
                     /standard_name="183647"
     repeat_region   complement(8708..9282)
                     /rpt_family="L1MD3"
     repeat_region   complement(9287..9357)
                     /rpt_family="MLT1F1"
     repeat_region   complement(9359..9460)
                     /rpt_family="L1MC3"
     repeat_region   complement(9587..9880)
                     /rpt_family="AluSg"
     repeat_region   complement(10203..10450)
                     /rpt_family="L1MC4"
     repeat_region   11468..11699
                     /rpt_family="L1M4"
     repeat_region   11717..11886
                     /rpt_family="MLT2CB"
     repeat_region   11908..11982
                     /rpt_family="Tigger3(Golem)"
     repeat_region   12020..12246
                     /rpt_family="THE1C"
     repeat_region   12263..12562
                     /rpt_family="AluSx"
     repeat_region   13326..13346
                     /rpt_family="AT_rich"
     repeat_region   complement(13934..14245)
                     /rpt_family="AluSx"
     repeat_region   complement(14256..14568)
                     /rpt_family="AluJo"
     repeat_region   14618..14746
                     /rpt_family="MER8"
     repeat_region   14825..14849
                     /rpt_family="AT_rich"
     repeat_region   14865..14906
                     /rpt_family="AT_rich"
     repeat_region   15465..15739
                     /rpt_family="L2"
     repeat_region   16579..16756
                     /rpt_family="(TTATA)n"
     repeat_region   complement(16757..17074)
                     /rpt_family="L2"
     repeat_region   17621..17660
                     /rpt_family="(CAAAA)n"
     repeat_region   18544..18725

Query Match 38.3%; Score 590.2; DB 9; Length 132745; 

Best Local Similarity 75.5%; Pred. No. 5.9e-121; 

Matches 760; Conservative 0; Mismatches 243; Indels 4; Gaps 2; 
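
The summary figures above can be cross-checked with a little arithmetic. The sketch below assumes that "Query Match" is the hit score expressed as a percentage of the perfect score (1543 for this query, as reported for the self-match), and that "Best Local Similarity" is matches divided by aligned columns (matches + mismatches + indels). Both definitions are inferred from the numbers in this report rather than taken from GenCore documentation, and the function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score. Assumed definition."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of aligned columns. Assumed definition."""
    return 100.0 * matches / (matches + mismatches + indels)


# Figures reported for the genomic hit above (Score 590.2, Length 132745).
print(round(query_match_pct(590.2, 1543), 1))
print(round(best_local_similarity_pct(760, 243, 4), 1))
```

With these assumed definitions the computed values round to the reported 38.3% and 75.5%.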

^ 46 GGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTA 105 

1111 1 1 1 I I I I I I I I I | M | | | | | | | M | | | | | | | | M I I I I II 
123124 123065 ^^^^^^^^^^^-^^^TGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAAAGTACTA 

Qy 106 CCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGT 165 

11111 M II II I I I I I I | | | || | | | | | || | || | || || {{I || 

Db 123125 CCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCATTGTTGT 

123184 

Qy 166 GTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCT 225 

I 1 1 1 I I II I I I I I I I I I I I I I I I I I I M I I I I I I M II 

1232 44 TTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCTTTAACCT 

Qy 226 TTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAA 285 

II M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
123304 123245 CTCTGTCTCTGACTTAGCTTTTCTGTGC ACCCTCCCCATGCTGATAAGGAGTTATGCCAA 

Qy 286 TGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAA 345 

11 'I 111 I I I I I I I I II I I I M I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 123305 TGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTCATGCCAA 

123364 

Qy 34 6 CCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAA 4 05 

1 1 I I 1 1 I I I I I I M I I I I I I | | | | | M II I I I M | | | | | | | || | || ,| 

123424 123365 CCTCTATACCAGCATTCTCTTTCTCACT TTTATCAGCATAGATCGATACTTGATAATTAA 

Qy 406 GTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGC 465 

HI I I M I I I I I I I I | M | | | | || | | | || | || | | | | | | | | | | | | | | | | M | | 

Db 123425 GTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCTCCTTGGC 

12 34 84 

Qy 466 TGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCC 525 



Db 123485 
123544 



I ' I I I I I I I I I I I I I | | | | | | | | | | | | | | | | M | | | | | M 
CATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATCCTGTTAT 



Qy 526 AAAAGAAGAG G G C AGT AACT GC AT C GACT AT G C AAGT T C T G GAAAC C C T GAAC AC AAT C T 5 85 

11 ' I I I I II I I I I | | | | | | | | | | | | | || | | || | | | nun 
AAC T GACAAT G GCAC C AC C T GT AAT GAT T TT GCAAGT T CT G GAGAC C C C AACT ACAAC CT 



Db 123545 
123604 



Qy 586 CATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTT 645 

I I I I I I I I II I I I II II | | | | | | | | | | | | | | | | | | I I I I I I | || 

CATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGATGTGTTT 



Db 123605 
123664 



Qy 64 6 CT T C T ACT ACAAGAT G GT AGT CT T C TTAAAGAGGAG GAG C CAG CAGCAAG CAACT GC C C T 7 05 

1 1 1 M I I I I I I II I I I | | | || | | | |M | M I I I I I I I 

CTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTACTGCTCT 



Db 123665 
123724 

Qy 706 

Db 123725 
123784 



GCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTT 7 65 
1 1 1 I < I I I I I I I Mill I I || M II II I I || | | | | | | | | || 

GCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTGTGCTTTT 



Qy 7 66 CACAC C CT AT CAT AT C ATG C G CAAT T T GAGGAT C GC CT C AC GCC T GGAT AGT T G G C C 822 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
TACAC CCTAT CAC GT CATGCGGAAT GT GAGGAT CGCTT CACGCCT GGGGAGTT GGAAGCA 



Db 123785 
123844 



Qy 823 ACAAGGAT GT AC AC AGAAGGC C AT C AAAT C TAT AT AC AC ACT G AC AC GGCCTCTGGCCTT 882 

1 I I M I I I I I I I I I I I I I I I I I I I I | | | | 

GT AT CAGT G C ACT C AGGT C GT CAT CAAC T C C T T T T AC ATT GT GACAC GGCCTTTGGCCTT 



Db 123845 
123904 



Qy 883 T CT GAACAGT GC C AT CAAT C C CAT CT T CT ACT T C CT C AT GG GAGAC CAT T AC AGAGAGAT 942 

I I I N I I I I I I II I | | | | | | | | || | | | | || | | | | | | | M | II I I I II 
T CT GAAC AGTGTC AT CAAC CCTGTCTTC T AT TTT CT T T T GGGAGAT C AC TT C AG GGAC AT 



Db 123905 
123964 



Qy 943 G C T GAT T AGT AAG T T CAG AC AAT AC T T C AAG T C C C T T AC AT C C T T CAG G AC AT GAG C T G C 1002 

1 M I I M I I M I I I I I I I I I II I || | | | || | | | | | | | | | 

GCT GAT GAAT CAACT GAGACACAACT T CAAAT C C CT T AC AT C C T T T AGC AGAT GG GCT C A 



Db 123965 
124024 



Qy 1003 T GGAT GCAGGT C T T CACT C AGC CAAAA- T GAGAC ACT T GATAAAC AG 104 8 

' I I I I I I I I I I I I I || | | | | MM I I || I I 

Db 124025 T GAACT C CT AC TTT CAT T CAGAGAAAAGT GAG GGG CT T GT GAAACAG 124071 



Search completed: August 24, 2004, 14:51:12 
Job time : 6260 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: 

Title: US-09-891-138A-1 
Perfect score: 1543 
Sequence: 1 gctcctggcagagttttctg tgcctaaataaatcaatata 1543 

Gapop 10.0 , Gapext 1.0 



Searched: 3373863 seqs, 2124099041 residues 

Total number of hits satisfying chosen parameters; 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
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RESULT 1 
ABK12957 

ID ABK12957 standard; DNA; 1543 BP. 
XX 

AC ABK12957; 
XX 

DT 09-APR-2002 (first entry) 
XX 

DE DNA sequence of mouse G-protein coupled receptor TGR18 gene. 
XX 

KW Mouse; G-protein coupled; receptor; GPCR; TGR18; kidney disease; 

KW signal transduction modulator; cerebral cavernous malformation; 

KW hyperlipidemia; obesity; dyslexia; cardiac myxoma; renal failure; 

KW nephritis; hypertension; liver disease; cirrhosis; blood disorder; 



KW spleen-associated disorder; immune disorder; gene; ds. 
XX 

OS Mus sp. 
XX 

FH Key Location/Qualifiers 

FT CDS 44. .997 

FT /*tag= a 

FT /product= "Mouse G-protein coupled receptor TGR18" 

XX 

PN WO200200719-A2. 
XX 

PD 03-JAN-2002. 
XX 

PF 25-JUN-2001; 2001WO-US020363. 
XX 

PR 23-JUN-2000; 2000US-0213461P. 
XX 

PA (TULA-) TULARIK INC. 
XX 

PI Lin DC, Zhao J, Chen J, Cutler G; 
XX 

DR WPI; 2002-147880/19. 

DR P-PSDB; AAU74904. 
XX 

PT New G-protein coupled receptor polypeptides, useful for identifying 

PT modulators of signal transduction for treating kidney disease, 

PT hyperlipidemia, obesity, dyslexia and cardiac myxoma. 
XX 

PS Claim 18; Page 58; 78pp; English. 
XX 

CC The present invention relates to a new G-protein coupled receptor (GPCR) 

CC polypeptide comprising greater than 70% amino acid sequence identity to 

CC the amino acid sequence of human GPCRs TGR62, TGR21, TGR130.1, TGR130.2, 

CC human TGR213 or TGR92, 80% amino acid sequence identity to mouse TGR18 or 

CC 90% amino acid sequence identity to human novel edg receptor protein, as 

CC defined in the specification. The GPCR covalently linked to a solid phase 

CC is useful for identifying a compound that modulates signal transduction. 

CC The identified compounds are useful for treating kidney disease, cerebral 

CC cavernous malformations, hyperlipidemia, obesity, dyslexia and cardiac 

CC myxoma. The molecules of the invention are useful for diagnosing 

CC disorders or conditions such as kidney-related conditions or diseases 

CC such as renal failure, nephritis, nephrotic syndrome, asymptomatic 

CC urinary abnormalities, renal tubule defects, hypertension and 

CC nephrolithiasis, liver-related disease or condition e.g. cirrhosis, 

CC infiltrations, lesions, functional disorders and jaundice and spleen- 

CC associated disorders or conditions e.g. splenic enlargement, immune 

CC disorders, blood disorders and others. Modulation of the polypeptide of 

CC the invention is useful to treat or prevent any of the above conditions 

CC or diseases. The present nucleic acid sequence encodes the mouse GPCR 

CC TGR18 protein of the invention. This sequence encodes one of seven novel 

CC G protein coupled receptors of the invention (ABK12957-ABK12964) 

XX 

SQ Sequence 1543 BP; 438 A; 352 C; 293 G; 460 T; 0 U; 0 Other; 
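
A quick consistency check on an SQ line is that the per-base counts add up to the stated length. The sketch below parses the SQ line as it appears above; the parsing approach and the function name are illustrative assumptions, not part of the Geneseq format definition.

```python
import re


def parse_sq_line(line: str) -> tuple[int, dict[str, int]]:
    """Return (stated length, per-base counts) from a Geneseq-style SQ line."""
    header, _, rest = line.partition(";")
    length = int(re.search(r"Sequence\s+(\d+)\s+BP", header).group(1))
    counts = {base: int(n) for n, base in re.findall(r"(\d+)\s+([A-Za-z]+);", rest)}
    return length, counts


sq = "SQ Sequence 1543 BP; 438 A; 352 C; 293 G; 460 T; 0 U; 0 Other;"
length, counts = parse_sq_line(sq)

# The base counts should sum to the stated sequence length.
print(sum(counts.values()) == length)

# GC content as a percentage of the stated length.
print(round(100.0 * (counts["G"] + counts["C"]) / length, 1))
```

For this record the counts sum to 1543, which agrees with the stated length.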



Query Match 100.0%; Score 1543; DB 6; Length 1543; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1543; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy      1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60

Qy     61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120

Qy    121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180

Qy    181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240

Qy    241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300

Qy    301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360

Qy    361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420

Qy    421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480

Qy    481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 540

Qy    541 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 600

Qy    601 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 660

Qy    661 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 720

Qy    721 CCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 780

Qy    781 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 840

Qy    841 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 900

Qy    901 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 960

Qy    961 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1020

Qy   1021 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1080

Qy   1081 ACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 ACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 1140

Qy   1141 GGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 GGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGG 1200

Qy   1201 AAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACA 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 AAAAAATAAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACA 1260

Qy   1261 AATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 AATATTGTCAATGTTTGGACACTTAGGATCTGAAATCTTGGAAATTTTAAGACCTCTTTT 1320

Qy   1321 TCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCAT 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 TCTATCAGTGTAAAAGGAATACAAGATAGCTAGTTGCAAATGCTGAATGCATTTCATCAT 1380

Qy   1381 TGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAA 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 TGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAA 1440

Qy   1441 TTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTAT 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 TTTATGTGAAAAATGAATATAATTCAATGTACAACATTAGATTTTCTATTTGAAAATTAT 1500

Qy   1501 ATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543
          |||||||||||||||||||||||||||||||||||||||||||
Db   1501 ATTTCTTGAAAAAATAACTGCTGTGCCTAAATAAATCAATATA 1543


RESULT 2 
AAA46036 

ID AAA46036 standard; cDNA; 1005 BP. 
XX 
AC AAA46036; 
XX 
DT 22-AUG-2000 (first entry) 
XX 
DE Human G protein coupled receptor hCHN10 encoding cDNA SEQ ID NO: 37. 
XX 
KW Human; G protein coupled receptor; GPCR; transmembrane receptor; 
KW identification; agonist; screening; therapeutic; pharmaceutical; mutant; 
KW ss. 
XX 
OS Homo sapiens. 
XX 
PN WO200022131-A2. 
XX 
PD 20-APR-2000. 
XX 
PF 13-OCT-1999; 99WO-US024065. 
XX 
PR 13-OCT-1998; 98US-00170496. 
PR 12-NOV-1998; 98US-0108029P. 
PR 20-NOV-1998; 98US-0109213P. 
PR 27-NOV-1998; 98US-0110060P. 
PR 16-FEB-1999; 99US-0120416P. 
PR 26-FEB-1999; 99US-0121852P. 
PR 12-MAR-1999; 99US-0123944P. 
PR 12-MAR-1999; 99US-0123945P. 
PR 12-MAR-1999; 99US-0123946P. 
PR 12-MAR-1999; 99US-0123948P. 
PR 12-MAR-1999; 99US-0123949P. 
PR 12-MAR-1999; 99US-0123951P. 
PR 28-MAY-1999; 99US-0136436P. 
PR 28-MAY-1999; 99US-0136437P. 
PR 28-MAY-1999; 99US-0136439P. 
PR 28-MAY-1999; 99US-0137127P. 
PR 28-MAY-1999; 99US-0137131P. 
PR 28-MAY-1999; 99US-0137567P. 
PR 29-JUN-1999; 99US-0141448P. 
PR 27-AUG-1999; 99US-0151114P. 
PR 03-SEP-1999; 99US-0152524P. 
PR 29-SEP-1999; 99US-0156555P. 
PR 29-SEP-1999; 99US-0156633P. 
PR 29-SEP-1999; 99US-0156634P. 
PR 29-SEP-1999; 99US-0156653P. 
PR 01-OCT-1999; 99US-0157280P. 
PR 01-OCT-1999; 99US-0157281P. 
PR 01-OCT-1999; 99US-0157282P. 
PR 01-OCT-1999; 99US-0157293P. 
PR 01-OCT-1999; 99US-0157294P. 
PR 12-OCT-1999; 99US-00416760. 
PR 12-OCT-1999; 99US-00417044. 
XX 
PA (AREN-) ARENA PHARM INC. 
XX 
PI Behan DP, Lehmann-Bruinsma K, Chalmers DT, Chen R, Dang HT; 
PI Gore M, Liaw CW, Lin I, Lowitz K, White C; 
XX 
DR WPI; 2000-317986/27. 
DR P-PSDB; AAB02842. 
XX 
PT Non-endogenous, human G protein-coupled receptors for screening receptor, 
PT inverse or partial agonists useful as therapeutic agents. 
XX 
PS Example 1; Page 116; 187pp; English. 
XX 

CC The present invention describes transmembrane receptors, preferably human 

CC G protein coupled receptors (GPCR), for which the endogenous ligand is 

CC unknown (orphan GPCR receptors). More specifically the present invention 

CC relates to non-endogenous, constitutively activated versions of a human 

CC GPCR. These non-endogenous human GPCRs can be useful for the direct 

CC identification of candidate compounds as receptors agonists, inverse 

CC agonists or partial agonists for use as pharmaceutical agents. AAA46017 

CC to AAA46126 and AAB02825 to AAB02859 represent sequences used in the 

CC exemplification of the present invention 
XX 

SQ Sequence 1005 BP; 248 A; 236 C; 196 G; 325 T; 0 U; 0 Other; 



Query Match 38.4%; Score 592.4; DB 3; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 1.3e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1; 



QY 39 G CAGAAT GGC AC AGAAT T TAT C T T GT GAGAAT T G GT T GGCAAC AGAG G CT AT C T T GAAT A 98 

II I M I I I I I I I I I I I I I I I I I I I Ml I I I I I 

Db 8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67 

99 AGT ACT AC CT CT C T GC AT T T TAT GCAAT C GAGT T C ATT T T T G GACT GCT T GG GAAT GT C A 158 
I I I M I I I I I I I I I I I I II I I I I I I I I I I | | | | | | | M I I I I II 

Db 68 AGTACT AC CT TT C CAT T T T T TAT GGGAT T GAGT T C GT T GT GGGAGT C CT T GGAAAT AC C A 127 

Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

IN M I I II I I M M M II I I I I | | | | | | | | || | | || | Ml | | | | | | | 
Db 128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187 

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I | Mill 
Db 188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247 



Qy 279 AT GC CAAT GAT AAGGGGAC CT AT G GAGAT GT T CT CT GT ATAAGCAAC C GAT AT GT G CT T C 338 

I I I I I I I I I II I I I I II II II I M Mill I I I I I I I I I I I I I I I | M I I II 
Db 24 8 AT GC C AAT G GAAACT GGAT AT AT G GAGAC GT G CT CT GC ATAAGCAAC C GAT AT GT G CT T C 307 

Qy 339 AC AC CAACCT C T AC AC C AGC AT CCTCTTCCT C ACT T T CAT T AGC AT G GAC C GAT AT CT GC 398 

I M M I I I I I I I M I M I I I I I I I II II I II I I I I II M I I I II II II 
Db 3 08 AT G C C AAC C T C TAT AC C AG CAT TCTCTTTCT C AC T T T TAT C AG CAT AG AT C GAT AC T T G A 367 

Qy 399 T CAT GAAGT AC C CT T T C C GAGAACACT T T C TACAAAAGAAGGAAT TT GC C AT T T T AAT C T 458 

I II I I I I I I II I M I I II II I I I I I I I I || | | M | M Mill I I II I II I I I 
Db 368 T AAT TAAGT AT C CT T T C C GAGAACAC CT T C T G CAAAAGAAAGAGT TT GCT AT T T T AAT C T 427 

Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518 

I I I I I Mill M II I M II I I I I I II I I || M I I I I M I I I 
Db 428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487 

Qy 519 C T GT C C C AAAAGAAGAGGG CAGT AACT G CAT C GACT AT G CAAGT T CT G GAAAC C C T GAAC 57 8 

I I I I Ml I II I I I I I I | M I I I I I I II I 

Db 48 8 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547 



Qy 



579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638 
I I I I I I I I I M I M M I M M M I I I I I I I I || | | M II I I I I I I I I II 



Db 54 8 ACAACCTCATTTACAGCATGTGTCTT^ACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607 

Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698 

I I M I I I I I I I I I I I I I M I I I I I I I I I I I I I | | | MM Ml 

Db 60 8 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667 

QY 69 9 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758 

I M I II M I II I I M M I I I M I I I I I I I II || I II I I I II I 

Db 668 CTGCTCTGCCCCTTGAA7\AGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727 

QY 75 9 TACT CTT C AC AC C C TAT CAT AT CAT GC G C AAT T T GAGGAT C G C C T C AC GC CT G GAT AGTT 818 

I M M I I I I I I I I I I I I I I II I I Ml I I I I I I I I I I I I I || || II I I I II 
Db 72 8 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 7 87 

Qy 819 G G CC ACAAGGAT GT AC AC AGAAG GC CAT CAAAT CT AT AT AC AC ACT GAC AC G GC CT C 875 

I M I I I I I I I I I I II I I I I I I I I II I I II I I II | | | 

Db 78 8 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847 

Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935 

I I I M I I I I I I I I M I I I MUM II I II I I I I I I II I II I I M II I II 
Db 848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907 

Qy 936 GAGAGAT GCT GATTAGTAAGTT CAGACAAT ACTT CAAGT C CCTTACAT CCTT CAGGACAT 995 

I I I I I I I I II I I I I I I || I I I II I M I I I I I I I M II I I I I I I I I I 
Db 908 GGGAC AT GCT GAT GAAT C AACT GAGACACAACTT CAAAT C C C T TACAT CCTT TAG CAGAT 967 

Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029 

I I I I III I I I I I II II MM 

Db 968 G GGC T CAT GAACT C C T ACT T T CAT T CAGAGAAAA 1001 



RESULT 3 
AAD01135 

ID AAD01135 standard; cDNA; 1005 BP. 
XX 

AC AAD01135; 
XX 

DT 02-NOV-2000 (first entry) 
XX 

DE Human orphan G protein-coupled receptor hCHN10 cDNA. 
XX 

KW Human; orphan G protein-coupled receptor; GPCR; hCHN10; drug screening; 
KW transmembrane receptor; expressed sequence tag; EST; signal cascade; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1. .1005 

FT /*tag= a 

FT /product= "hCHN10" 

FT /note= "Human orphan G protein-coupled receptor" 

XX 

PN WO200031258-A2. 
XX 

PD 02-JUN-2000. 
XX 

PF 13-OCT-1999; 99WO-US023687. 
XX 

PR 20-NOV-1998; 98US-0109213P. 
PR 16-FEB-1999; 99US-0120416P. 
PR 26-FEB-1999; 99US-0121852P. 
PR 12-MAR-1999; 99US-0123946P. 
PR 12-MAR-1999; 99US-0123949P. 
PR 28-MAY-1999; 99US-0136436P. 
PR 28-MAY-1999; 99US-0136437P. 
PR 28-MAY-1999; 99US-0136439P. 
PR 28-MAY-1999; 99US-0136567P. 
PR 28-MAY-1999; 99US-0137127P. 
PR 28-MAY-1999; 99US-0137131P. 
PR 29-JUN-1999; 99US-0141448P. 
PR 29-SEP-1999; 99US-0156555P. 
PR 29-SEP-1999; 99US-0156633P. 
PR 29-SEP-1999; 99US-0156634P. 
PR 29-SEP-1999; 99US-0156653P. 
PR 01-OCT-1999; 99US-0157280P. 
PR 01-OCT-1999; 99US-0157281P. 
PR 01-OCT-1999; 99US-0157282P. 
PR 01-OCT-1999; 99US-0157293P. 
PR 01-OCT-1999; 99US-0157294P. 
PR 12-OCT-1999; 99US-00416760. 
PR 12-OCT-1999; 99US-00417044. 
XX 

PA (AREN-) ARENA PHARM INC. 
XX 

PI Chen R, Dang HT, Liaw CW, Lin I; 
XX 

DR WPI; 2000-400068/34. 

DR P-PSDB; AAY71308. 
XX 

PT Novel human orphan G protein-coupled receptors and the encoding cDNAs for 

PT use in the identification of G protein-coupled receptor agonists. 

XX 

PS Claim 69; Page 86; 102pp; English. 
XX 

CC The present sequence is a cDNA encoding hCHN10, an endogenous human 

CC orphan G protein-coupled receptor (GPCR), expressed in kidney and 

CC thyroid. The hCHN10 cDNA was identified using the human EST (expressed 

CC sequence tag) 1365839 as a probe. The orphan GPCR of the invention, like 

CC all GPCRs, has seven transmembrane alpha helices with an extracellular N- 

CC terminus and an intracellular C-terminus. However, no endogenous ligand 

CC has yet been identified for the proteins of the invention. The orphan 

CC GPCRs may be used in the identification of their endogenous ligands, and 

CC to screen potential GPCR agonists and antagonists for use as 

CC pharmaceutical agents. The proteins may also be used in the study of GPCR 

CC -mediated signalling cascades, and to elucidate their precise role in 

CC normal and diseased human conditions. Nucleic acid encoding human orphan 

CC GPCRs may be used for tissue localisation expression analysis to provide 

CC information about their function in healthy and pathological states 

XX 

SQ Sequence 1005 BP; 248 A; 236 C; 196 G; 325 T; 0 U; 0 Other; 

Query Match 38.4%; Score 592.4; DB 3; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 1.3e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1; 



39 GC AGAAT GG C ACAGAAT T TAT C T T GT GAGAAT T G GT T GGCAACAGAGGCT AT C T T GAAT A 98 
II 'Mill I I I I IMM | || | | | || | | | | | | | | | | | | | | | | 
8 GGATCATGGCATGGAATGCAACTTGCT^AAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67 

99 AGT AC T AC C T C T CT GC AT T T TAT G C AAT C GAGT T CAT T T T TG GACT GCT T G GGAAT GT C A 158 

I I I I I I I I I I I I II I I I I I I I I I I I I | | || M II I I I 

68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127 

159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

HI I I I I I I I I I I I I I I | | I I I II I I I II I I I I I I I I I IN I IMM I 
128 T T GT T GT T T AC G G C T AC AT CTTCTCTCT GAAGAACT G GAACAGCAGTAAT AT T TAT C T C T 187 

219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

I I I N I I II I I I M II I I I I I II I II I I || | | | || | M I I I I I I I I I MMI 

188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 24 7 

279 AT GC CAAT GATAAG GG GAC CT AT GGAGAT GT T CT CT GT AT AAGCAAC C GAT AT GT GC T T C 338 

I I I I I I I I I M M I II I II || | | | I I I II I I II I I II I || || M | I 

Db    248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
Db    308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
Db    368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
Db    428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
Db    488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
Db    548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
Db    608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
Db    668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
Db    728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787

Qy    819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
Db    788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
Db    848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
Db    908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029
Db    968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001



RESULT 4
ACA93273

ID ACA93273 standard; cDNA; 1005 BP.
XX
AC ACA93273;
XX
DT 16-JUL-2003 (first entry)
XX
DE Human cDNA encoding GPCR hCHN10.
XX
KW Human; ss; gene; orphan G protein-coupled receptor; GPCR; hARE-3; hARE-4;
KW hARE-5; hRUP3; hRUP5; hRUP6; hRUP7; hGPCRZ7; hARE-1; hARE-2; hPPR1; hG2A;
KW hCHN3; hCHN4; hCHN6; hCHN8; hCHN9; hCHN10; hRUF4; signalling cascade.
XX
OS Homo sapiens.
XX
PN US2003017528-A1.
XX
PD 23-JAN-2003.
XX
PF 06-JUN-2001; 2001US-00875076.
XX
PR 20-NOV-1998; 98US-0109213P.
PR 16-FEB-1999; 99US-0120416P.
PR 26-FEB-1999; 99US-0121852P.
PR 12-MAR-1999; 99US-0123946P.
PR 12-MAR-1999; 99US-0123949P.
PR 28-MAY-1999; 99US-0136436P.
PR 28-MAY-1999; 99US-0136437P.
PR 28-MAY-1999; 99US-0136439P.
PR 28-MAY-1999; 99US-0136567P.
PR 28-MAY-1999; 99US-0137127P.
PR 28-MAY-1999; 99US-0137131P.
PR 29-JUN-1999; 99US-0141448P.
PR 28-SEP-1999; 99US-0156333P.
PR 29-SEP-1999; 99US-0156555P.
PR 29-SEP-1999; 99US-0156634P.
PR 12-OCT-1999; 99US-00417044.
XX
PA (CHEN/) CHEN R.
PA (DANG/) DANG H T.
PA (LIAW/) LIAW C W.
PA (LINI/) LIN I.
XX
PI Chen R, Dang HT, Liaw CW, Lin I;
XX
DR WPI; 2003-428952/40.
DR P-PSDB; ABU92276.
XX
PT Novel endogenous, orphan, human G protein-coupled receptors useful for
PT identification of modulators of the receptor and as research tools for
PT understanding the role of the receptor in human body.
XX
PS Claim 69; Page 40-41; 54pp; English.
XX
CC The invention relates to a human G protein-coupled receptor (GPCR)
CC appearing as ABU92259-ABU92277 (encoded by cDNAs ACA93256-ACA93274) named
CC hARE-3, hARE-4, hARE-5, hRUP3, hRUP5, hRUP6, hRUP7, hGPCRZ7, hARE-1,
CC hARE-2, hPPR1, hG2A, hCHN3, hCHN4, hCHN6, hCHN8, hCHN9, hCHN10 and hRUF4.
CC Also included are a plasmid comprising a vector and one of the cDNAs
CC above and a host cell comprising the plasmid. The GPCRs are useful for
CC the direct identification of candidate compounds as inverse agonists,
CC agonists or partial agonists. In vitro and in vivo systems incorporating
CC GPCRs are useful for elucidating and understanding the roles these
CC receptors play in the human condition, both normal and diseased, as well
CC as understanding the role of constitutive activation as it applies to
CC understanding the signalling cascade. The cDNAs are useful for making a
CC probe for dot-blot analysis against tissue mRNA and/or RT-PCR
CC identification of the expression of the receptor in tissue samples. The
CC present sequence is a cDNA encoding a GPCR of the invention
XX
SQ Sequence 1005 BP; 248 A; 236 C; 196 G; 325 T; 0 U; 0 Other;

Query Match 38.4%; Score 592.4; DB 7; Length 1005;
Best Local Similarity 75.5%; Pred. No. 1.3e-139;
Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1;
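
The percentages in the summary lines above can be cross-checked from the printed counts. The sketch below assumes (the report does not state this) that Best Local Similarity is matches divided by the number of aligned columns (matches + mismatches + indels), and that Query Match is the score divided by the perfect score of 1543 listed in the run header. Under those assumptions the values printed for this result are reproduced.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts printed for RESULT 4 (ACA93273).
matches, mismatches, indels = 750, 241, 3
score = 592.4
perfect_score = 1543  # assumption: taken from the "Perfect score" run header

# Assumed denominator: every aligned column is a match, a mismatch or an indel.
aligned_columns = matches + mismatches + indels

best_local_similarity = percent(matches, aligned_columns)
query_match = percent(score, perfect_score)

print(f"Best Local Similarity {best_local_similarity}%")
print(f"Query Match {query_match}%")
```

The same check applied to a result reporting 764 matches, 246 mismatches and 4 indels gives 75.3%, in line with the later entries that share the score of 592.4.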



Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
Db      8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
Db     68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
Db    128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
Db    188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
Db    248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
Db    308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367






Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
Db    368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
Db    428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
Db    488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
Db    548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
Db    608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
Db    668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
Db    728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787

Qy    819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
Db    788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
Db    848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
Db    908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029
Db    968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001
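
In alignments of this kind, a marker row between each Qy line and its Db line flags the positions where the two sequences carry the same base. A minimal sketch of how such a row can be derived is given below, assuming the two strings are already aligned to equal length and that `-` denotes a gap; the function name `match_line` is hypothetical and not part of the search program. The example strings are the first aligned block of this query (positions 39 to 98) and its database counterpart.

```python
def match_line(query, subject, gap="-"):
    """Return a marker row with '|' at identical, non-gap positions."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    return "".join(
        "|" if q == s and q != gap else " "
        for q, s in zip(query, subject)
    )


# First aligned block of the query and the matching database row.
qy = "GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA"
db = "GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA"

line = match_line(qy, db)
print("Qy", qy)
print("  ", line)
print("Db", db)
print(line.count("|"), "identical positions out of", len(qy))
```

Summing the `|` counts over all blocks of one alignment should give the "Matches" figure in that result's summary, provided the gap positions are known for every block.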





RESULT 5 
ABZ42542 

ID ABZ42542 standard; DNA; 1380 BP. 
XX 

AC ABZ42542; 
XX 

DT 04-MAR-2003 (first entry) 
XX 

DE Human purinergic receptor P2U2 nucleotide SEQ ID NO: 566. 
XX 

KW G protein-coupled receptor; GPCR; antigenic peptide; gene therapy; 

KW G protein-coupled receptor modulator; antibody; immune-related disease; 



KW growth-related disease; cell regeneration-related disease; AIDS; cancer; 

KW immunological-related cell proliferative disease; autoimmune disease;

KW Alzheimer's disease; atherosclerosis; infection; osteoarthritis; allergy;

KW osteoporosis; cardiomyopathy; inflammation; Crohn's disease; diabetes; 

KW graft versus host disease; Parkinson's disease; multiple sclerosis; pain; 

KW psoriasis; anxiety; depression; schizophrenia; dementia; memory loss; 

KW mental retardation; epilepsy; asthma; tuberculosis; obesity; nausea; 

KW hypertension; hypotension; renal disorder; rheumatoid arthritis; trauma; 

KW ulcer; gene; ds.

XX

OS Homo sapiens.
XX

PN WO200261087-A2.
XX

PD 08-AUG-2002.
XX

PF 19-DEC-2001; 2001WO-US050107.
XX

PR 19-DEC-2000; 2000US-0257144P.
XX 

PA (LIFE-) LIFESPAN BIOSCIENCES INC.
XX 

PI Burmer GC, Roush CL, Brown JP; 
XX 

DR WPI; 2003-046718/04. 

DR P-PSDB; ABP81696. 
XX 

PT New isolated antigenic peptides e.g., for G protein-coupled receptors 

PT (GPCR), useful for diagnosing and designing drugs for treating conditions 

PT in which GPCRs are involved, e.g. AIDS, Alzheimer's disease, cancer or 

PT autoimmune diseases. 

XX 

PS Disclosure; Fig 1; 523pp; English. 
XX 

CC The present invention describes antigenic peptides (I) comprising: (a) 

CC any one of 1601 sequences (see ABP82019 to ABP83619) of 12-24 amino 

CC acids. Also described: (1) an assay for the detection of a particular G 

CC protein-coupled receptor (GPCR) or a candidate polypeptide in a sample; 

CC and (2) an isolated antibody having high specificity and high affinity or 

CC avidity for a particular GPCR. (I) can be used as GPCR modulators and in 

CC gene therapy. The antigenic peptides for GPCRs are useful in detecting an 

CC antibody against a particular GPCR, and in the production of specific 

CC antibodies. The peptides and antibodies are also useful for detecting the 

CC presence or absence of corresponding GPCRs. The antigenic peptides for 

CC GPCRs and antibodies are useful for diagnosing and designing drugs for 

CC treating immune-related diseases, growth-related diseases, cell 

CC regeneration-related disease, immunological-related cell proliferative 

CC diseases, or autoimmune diseases, e.g. AIDS, Alzheimer's disease, 

CC atherosclerosis, bacterial, fungal, protozoan or viral infections, 

CC osteoarthritis, osteoporosis, cancer, cardiomyopathy, chronic and acute 

CC inflammation, allergies, Crohn's disease, diabetes, graft versus host 

CC disease, Parkinson's disease, multiple sclerosis, pain, psoriasis, 

CC anxiety, depression, schizophrenia, dementia, mental retardation, memory 

CC loss, epilepsy, asthma, tuberculosis, obesity, nausea, hypertension, 

CC hypotension, renal disorders, rheumatoid arthritis, trauma, ulcers, or 

CC any other disorder in which GPCRs are involved. The antibodies may be 

CC used in immunoassays and immunodiagnosis. ABZ42523 to ABZ42869 encode



CC GPCR proteins given in ABP81675 to ABP82018, which are used in the 

CC exemplification of the present invention 

XX 

SQ Sequence 1380 BP; 383 A; 294 C; 274 G; 429 T; 0 U; 0 Other; 

Query Match 38.4%; Score 592.4; DB 7; Length 1380; 

Best Local Similarity 75.3%; Pred. No. 1.5e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2; 



Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
Db     50 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 109

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
Db    110 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 169

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
Db    170 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 229

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
Db    230 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 289

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
Db    290 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 349

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
Db    350 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 409

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
Db    410 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 469

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
Db    470 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 529

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
Db    530 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 589

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
Db    590 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 649

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
Db    650 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 709

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
Db    710 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 769

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
Db    770 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 829

Qy    819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
Db    830 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 889

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
Db    890 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 949

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
Db   1010 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1063



RESULT 6 
ABL90790 

ID ABL90790 standard; cDNA; 1436 BP. 
XX 

AC ABL90790; 
XX 

DT 24-MAY-2002 (first entry) 
XX 

DE Human polynucleotide SEQ ID NO 1352. 
XX 

KW Cytostatic; immunosuppressive; nootropic; neuroprotective; antiviral; 

KW antiallergic; hepatotropic; antidiabetic; antiinflammatory; antiulcer; 

KW vulnerary; anticonvulsant; antibacterial; antifungal; antiparasitic; 

KW cardiant; gene therapy; cancer; immune disorder; cardiovascular disorder; 

KW neurological disease; infection; human; secreted protein; gene; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200190304-A2. 
XX 

PD 29-NOV-2001. 
XX 

PF 18-MAY-2001; 2001WO-US016450.
XX

PR 19-MAY-2000; 2000US-0205515P.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Birse CE, Rosen CA; 
XX 

DR WPI; 2002-122018/16. 

DR P-PSDB; ABB90381. 
XX 

PT Novel 1405 isolated polypeptides, useful for diagnosis, treatment and 

PT prevention of neural, immune system, muscular, reproductive, 

PT gastrointestinal, pulmonary, cardiovascular, renal and proliferative 






PT disorders. 
XX 

PS Claim 4; SEQ ID NO 1352; 2081pp + Sequence Listing; English. 
XX 

CC The invention relates to novel genes (ABL89449-ABL90853) and proteins
CC (ABB89040-ABB90444) useful for preventing, treating or ameliorating 

CC medical conditions e.g. by protein or gene therapy. The genes are 

CC isolated from a range of human tissues disclosed in the specification. 

CC The nucleic acids, proteins, antibodies and (ant)agonists are useful in

CC the diagnosis, treatment and prevention of: (a) cancer, e.g. breast and 

CC ovarian cancer and other cancers of the adrenal gland, bone, bone marrow, 

CC breast, gastrointestinal tract, liver, lung, or urogenital; (b) immune 

CC disorders e.g. Addison's disease, allergies, autoimmune haemolytic 

CC anaemia, autoimmune thyroiditis, diabetes mellitus, Crohn's disease, 

CC multiple sclerosis, rheumatoid arthritis and ulcerative colitis; (c) 

CC cardiovascular disorders such as myocardial ischaemias; (d) wound healing;

CC (e) neurological diseases e.g. cerebral anoxia and epilepsy; and (f)
CC infectious diseases such as viral, bacterial, fungal and parasitic
CC infections. Note: The sequence data for this patent did not form part of
CC the printed specification, but was obtained in electronic format directly

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences
XX

SQ Sequence 1436 BP; 397 A; 309 C; 289 G; 441 T; 0 U; 0 Other; 

Query Match 38.4%; Score 592.4; DB 6; Length 1436; 

Best Local Similarity 75.3%; Pred. No. 1.6e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;
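
Each database entry in this listing is a run of lines that start with a two-letter tag (ID, AC, DT, DE, KW and so on), with XX lines as separators. As a rough illustration of how such an entry could be read programmatically, the sketch below groups the text after each tag into a dictionary. The helper name `parse_record` is hypothetical, and the rule "two upper-case letters followed by a space" is an assumption drawn from the lines shown here, not a full description of the flat-file format.

```python
from collections import defaultdict


def parse_record(lines):
    """Group the values of two-letter tagged lines by tag, skipping XX separators."""
    fields = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        tag, _, value = line.partition(" ")
        if tag == "XX":
            continue  # separator line, carries no data
        if len(tag) == 2 and tag.isupper():
            fields[tag].append(value.strip())
    return dict(fields)


# Header lines of the entry above, as printed in the report.
sample = [
    "ID ABL90790 standard; cDNA; 1436 BP.",
    "XX",
    "AC ABL90790;",
    "XX",
    "DT 24-MAY-2002 (first entry)",
    "XX",
    "DE Human polynucleotide SEQ ID NO 1352.",
]

record = parse_record(sample)
print(record["AC"][0])
print(record["DE"][0])
```

Multi-line fields such as KW or CC end up as lists with one item per line, so the original line order is preserved for later joining.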

Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
Db    100 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 159

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
Db    160 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 219

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
Db    220 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 279

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
Db    280 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 339

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
Db    340 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 399

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
Db    400 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 459

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
Db    460 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 519

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
Db    520 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 579

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
Db    580 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 639

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
Db    640 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 699

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
Db    700 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 759

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
Db    760 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 819

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
Db    820 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 879

Qy    819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
Db    880 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 939

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
Db    940 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 999

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
Db   1000 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1059

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
Db   1060 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1113





RESULT 7 
ACC46165 

ID ACC46165 standard; cDNA; 1473 BP. 
XX 

AC ACC46165; 
XX 

DT 02-JUN-2003 (first entry) 
XX 

DE Human dithp receptor-encoding cDNA. 
XX 

KW Human; dithp; diagnostic and therapeutic polynucleotide; diagnosis; 

KW cancer; cell proliferative disorder; autoimmune disorder; 

KW inflammatory disorder; infection; hormonal disorder; metabolic disorder; 

KW neurological disorder; gastrointestinal disorder; transport disorder; 

KW connective tissue disorder; drug screening; proteome analysis; 

KW gene therapy; antisense therapy; genotyping; transgenic animal; knock in; 

KW disease model; toxicological testing; transcript imaging; receptor; gene; 



KW ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200297031-A2. 
XX 

PD 05-DEC-2002. 
XX 

PF 27-MAR-2002; 2002WO-US010056.
XX

PR 28-MAR-2001; 2001US-0279619P.

PR 29-MAR-2001; 2001US-0280067P.

PR 29-MAR-2001; 2001US-0280068P.

PR 16-MAY-2001; 2001US-0291280P.

PR 17-MAY-2001; 2001US-0291829P.

PR 17-MAY-2001; 2001US-0291849P.

PR 19-JUN-2001; 2001US-0299428P.

PR 20-JUN-2001; 2001US-0299776P.

PR 20-JUN-2001; 2001US-0300001P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Daffo A, Jones AL, Tran AB, Dahl CR, Gietzen D, Chinn J; 

PI Dufour GE, Hillman JL, Yu JY, Tuason O, Yap PE, Amshey SR;

PI Daughtery SC, Dam TC, Liu TF, Nguyen DA, Kleefeld Y, Gerstin EH;

PI Peralta CH, David MH, Lewis SA, Chen AJ, Panzer SR, Harris B;

PI Flores V, Marwaha R, Lo A, Lan RY, Urashka ME; 

XX 

DR WPI; 2003-129518/12. 

DR P-PSDB; ABR41222. 
XX 

PT Novel human diagnostic and therapeutic polypeptide useful for identifying 

PT test compound which specifically binds to a polypeptide encoded by human 

PT diagnostic and therapeutic polynucleotide, and to induce antibodies. 
XX 

PS Claim 2; SEQ ID NO 86; 591pp; English. 
XX 

CC The invention relates to novel human diagnostic and therapeutic 

CC polynucleotides designated dithp (ACC46080-ACC46749) and to their encoded

CC proteins (DITHP; ABR41136-ABR41812). The invention also relates to

CC polynucleotide sequences at least 90% identical to the dithp cDNA 

CC sequences of the invention; recombinant vectors, host cells and 

CC transgenic organisms comprising a dithp nucleic acid sequence; the 

CC recombinant production of DITHP proteins; antibodies specific for DITHP 

CC proteins; microarrays comprising dithp nucleic acid sequences; methods of 

CC detecting dithp nucleotide and protein sequences; methods of screening 

CC for compounds which specifically bind a DITHP protein; and methods of 

CC assessing the toxicity of test compounds using a dithp hybridisation 

CC probe. Dithp nucleic acid sequences and DITHP proteins may be used in the 

CC diagnosis of a wide variety of conditions including cancer and other cell 

CC proliferative disorders; autoimmune or inflammatory disorders; bacterial, 

CC viral, fungal or parasitic infections; hormonal disorders; metabolic 

CC disorders; neurological disorders; gastrointestinal disorders; transport 

CC disorders; and connective tissue disorders. They may also be used to 

CC screen for modulators of protein activity or gene expression. DITHP 

CC proteins can additionally be used in analysis of the proteome of a tissue 

CC or cell type and to induce antibodies. The dithp nucleic acids are 



CC additionally useful in somatic or germline gene therapy of the disorders
CC mentioned above, as a source of antisense sequences, as a source of
CC probes and primers, in genotyping and identification of individuals, in
CC the generation of transgenic animal models of human disease or knock in
CC humanised animals, in toxicological testing, and in transcript imaging.
CC The present sequence represents a dithp cDNA encoding a DITHP protein
CC which has receptor activity. Note: The sequence data for this patent did
CC not form part of the printed specification, but was obtained in
CC electronic format directly from WIPO at
CC ftp.wipo.int/pub/published_pct_sequences
XX

SQ Sequence 1473 BP; 403 A; 320 C; 303 G; 447 T; 0 U; 0 Other;

Query Match 38.4%; Score 592.4; DB 7; Length 1473;

Best Local Similarity 75.3%; Pred. No. 1.6e-139;

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;



Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
Db    119 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 178

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
Db    179 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 238

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
Db    239 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 298

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
Db    299 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 358

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
Db    359 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 418

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
Db    419 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 478

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
Db    479 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 538

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
Db    539 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 598

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
Db    599 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 658

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
Db    659 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 718

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
Db    719 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 778

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
Db    779 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 838

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
Db    839 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 898

Qy    819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
Db    899 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 958

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
Db    959 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1018

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
Db   1019 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1078

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
Db   1079 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1132





RESULT 8 
AAD24958 

ID AAD24958 standard; cDNA; 1542 BP. 
XX 

AC AAD24958; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human G-protein coupled receptor-3 (GCREC-3) cDNA. 
XX 



KW Human; G-protein coupled receptor-3; GCREC-3; therapy; cancer; stroke;

KW cell proliferative disorder; neurological; epilepsy; Parkinson's disease;

KW Alzheimer's disease; inflammation; thyroiditis; haemolytic anaemia; AIDS; 

KW Acquired Immune Deficiency Syndrome; dementia; nootropic; cholelithiasis; 

KW multiple sclerosis; atherosclerosis; angina pectoris; gastroenteritis; 

KW diabetes; ulcer; viral infection; immunosuppressive; ss. 
XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers

FT CDS 63..1202

FT /*tag= a 

FT /product= "Human GCREC-3 protein" 
XX 

PN WO200198351-A2. 
XX 

PD 27-DEC-2001. 
XX 



PF 15-JUN-2001; 2001WO-US019275.
XX

PR 16-JUN-2000; 2000US-0212483P.

PR 22-JUN-2000; 2000US-0213954P.

PR 29-JUN-2000; 2000US-0215209P.

PR 07-JUL-2000; 2000US-0216595P.

PR 14-JUL-2000; 2000US-0218936P.

PR 19-JUL-2000; 2000US-0219154P.

PR 21-JUL-2000; 2000US-0220141P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Lai P, Baughn MR, Hafalia AJA, Nguyen DB, Gandhi AR, Kallick DA; 

PI Griffin JA, Yue H, Khan FA, Patterson C, Lu DAM, Tribouley CM; 

PI Lu Y, Walia NK, Graul R, Yao MG, Yang J, Ramkumar J, Au-Young J; 

PI Elliott VS, Hernandez R, Walsh RT, Borowsky ML, Thornton M, He A- 
XX 

DR WPI; 2002-075627/10. 

DR P-PSDB; AAE15633. 
XX 

PT Isolated human G-protein coupled receptor polypeptides and the use of 

PT these sequences in the diagnosis, treatment and prevention of diseases 

PT and in the assessment of exogenous compounds on the expression of the 

PT receptors. 
XX 

PS Claim 11; Page 133; 143pp; English. 
XX 

CC The invention relates to isolated human G-protein coupled receptor 

CC (GCREC) polypeptides and their biologically active fragments. GCREC and 

CC protein is useful in treating a disease or condition associated with an 

CC increase or decrease in expression of functional GCREC. The GCREC's are

CC useful in the diagnosis, treatment and prevention of cell proliferative 

CC disorders (cancer, leukaemia, melanoma); neurological disorders (stroke, 

CC epilepsy, Parkinson's disease, dementia, Alzheimer's disease); autoimmune 

CC inflammatory disorder (thyroiditis, haemolytic anaemia, AIDS, multiple 

CC sclerosis); cardiovascular disorder (atherosclerosis, angina pectoris),

CC gastrointestinal disorder (ulcer, cholelithiasis, gastroenteritis), 

CC metabolic disorders (diabetes); viral infections (herpes virus) and in 

CC the assessment of the effects of exogenous compounds on the expression of 

CC the nucleic acid and amino acid sequences. The present sequence is human 

CC GCREC-3 cDNA
XX 

SQ Sequence 1542 BP; 428 A; 327 C; 315 G; 472 T; 0 U; 0 Other; 

Query Match 38.4%; Score 592.4; DB 6; Length 1542; 

Best Local Similarity 75.3%; Pred. No. 1.6e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2; 
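
The percentages in this summary can be reproduced from the counts printed next to them. The sketch below is an inference from the figures, not something the report states: it assumes Best Local Similarity is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels), and that Query Match is the score divided by the perfect score of 1543 given in the run header. The function names are illustrative only.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def query_match(score, perfect_score):
    """Score as a percentage of the query's perfect score (assumed definition)."""
    return 100.0 * score / perfect_score


# Figures from the summary lines of this result
print(round(best_local_similarity(764, 0, 246, 4), 1))  # 75.3
print(round(query_match(592.4, 1543), 1))               # 38.4
```

Under the same assumptions, the later results in this listing work out the same way; for example 760 matches, 1 conservative, 249 mismatches and 4 indels give 75.0%.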

Qy    39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db   205 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 264

Qy    99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db   265 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 324

Qy   159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db   325 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 384

Qy   219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db   385 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 444

Qy   279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db   445 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 504

Qy   339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db   505 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 564

Qy   399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db   565 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 624

Qy   459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db   625 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 684

Qy   519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db   685 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 744

Qy   579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db   745 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 804

Qy   639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db   805 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 864

Qy   699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db   865 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 924

Qy   759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db   925 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 984

Qy   819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db   985 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 1044

Qy   876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db  1045 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1104

Qy   936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db  1105 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1164

Qy   996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db  1165 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1218



RESULT 9 
ABS57291 

ID ABS57291 standard; cDNA; 1338 BP. 
XX 

AC ABS57291; 
XX 

DT 30-JAN-2003 (first entry) 
XX 

DE cDNA encoding human adenosine receptor. 
XX 

KW Human; mammalian; adenosine receptor; G-protein coupled receptor; GPCR; 

KW adenosine-mediated medical condition; vasodilation; hypotension; 

KW reversal of tachycardia; chronic renal disease; thyroid disorder; 

KW inflammation; asthma; hypertensive; antiarrhythmic; antiinflammatory; 

KW antiasthmatic; gene; ss. 

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 1..1005

FT /*tag= a 

FT /product= "Adenosine receptor" 

XX 

PN US2002137887-A1. 
XX 

PD 26-SEP-2002. 
XX 

PF 17-JAN-2001; 2001US-00765034.
XX

PR 17-JAN-2001; 2001US-00765034.
XX 

PA (HEDR/) HEDRICK J A. 

PA (LACH/) LACHOWICZ J E. 

PA (WANG/) WANG W. 

PA (GUST/) GUSTAFSON E L. 

XX 

PI Hedrick JA, Lachowicz JE, Wang W, Gustafson EL; 
XX 

DR WPI; 2003-074992/07. 

DR P-PSDB; ABG72131. 
XX 

PT Novel isolated mammalian adenosine receptor polypeptide useful for 

PT identifying an agonist or antagonist of the receptor for treating 

PT vasodilation, hypotension, chronic renal diseases, thyroid disorders and 

PT inflammation. 

XX 

PS Example 1; Page 14-16; 19pp; English. 
XX 

CC The present invention relates to the isolation of a mammalian (human) 

CC adenosine receptor, and the polynucleotide sequence encoding it. The 

CC cloned receptor resembles a member of the G-protein coupled receptor 

CC (GPCR) superfamily that contains 7-transmembrane domains. The adenosine 

CC receptor is useful for identifying agonists and antagonists of the 

CC receptor, which may be useful for treating an adenosine-mediated medical 



CC condition. The adenosine receptor polypeptide sequence is also useful as 

CC an antigen to elicit antibody production in an immunologically competent 

CC host. An antibody which binds specifically to the adenosine receptor is 

CC useful for treating medical conditions caused or mediated by adenosine 

CC such as vasodilation, hypotension, reversal of tachycardia, chronic renal 

CC diseases, thyroid disorders and inflammation (e.g. asthma). The antibody 

CC can also be used to purify the adenosine receptor, or as a basis for 

CC immunoassays of the receptor. The polynucleotide sequence encoding the 

CC adenosine receptor is useful for producing vectors and host cells 

CC containing the vectors. It is also useful for measuring expression of a 

CC mammalian adenosine receptor gene in a biological sample. The present 

CC sequence encodes human adenosine receptor 
XX 

SQ Sequence 1338 BP; 370 A; 288 C; 265 G; 415 T; 0 U; 0 Other; 

Query Match 38.3%; Score 590.8; DB 7; Length 1338; 

Best Local Similarity 75.2%; Pred. No. 3.9e-139; 

Matches 763; Conservative 0; Mismatches 247; Indels 4; Gaps 2; 

Qy    39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db     8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67

Qy    99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db    68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127

Qy   159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db   128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187

Qy   219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db   188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247

Qy   279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db   248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy   339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db   308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy   399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db   368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427

Qy   459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db   428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487

Qy   519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db   488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547

Qy   579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db   548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607

Qy   639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db   608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667

Qy   699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db   668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727

Qy   759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | |  || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db   728 TGCCTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787

Qy   819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db   788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847

Qy   876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db   848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907

Qy   936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db   908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967

Qy   996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db   968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1021



RESULT 10 
ACD27619 

ID ACD27619 standard; cDNA; 1428 BP. 
XX 

AC ACD27619; 
XX 

DT 18-SEP-2003 (first entry) 
XX 

DE Human ATP receptor cDNA. 
XX 

KW Human; ss; gene; ATP receptor; G-protein coupled receptor; gene therapy; 

KW 7-transmembrane receptor; asthma; allergic rhinitis; hypertension; ulcer; 

KW angina pectoris; allergy; psychosis; depression; migraine; vomiting; 

KW benign prostatic hypertrophy; arterial thrombosis; myocardial infarction; 

KW urinary retention; angioplasty; cystic fibrosis; Parkinson's disease;

KW acute heart failure; hypotension; thrombolysis; osteoporosis. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 91..1096

FT /*tag= a 

FT /product= "ATP receptor" 

XX 

PN US2003054487-A1. 
XX 



PD 20-MAR-2003.
XX

PF 16-OCT-2002; 2002US-00270587.
XX

PR 11-JAN-1996; 96US-0009902P.

PR 10-JAN-1997; 97US-00781456.

PR 20-JUL-2001; 2001US-00908593.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Li Y; 
XX 

DR WPI; 2003-540615/51. 

DR P-PSDB; ABU63309. 
XX 

PT New polynucleotide, useful for producing a medicament for treating 

PT asthma, allergic rhinitis or hypertension. 

XX 

PS Claim 1; Fig 1; 24pp; English. 
XX 

CC The invention relates to an isolated polynucleotide encoding a G-protein 

CC coupled, 7-transmembrane ATP receptor. The polynucleotide is useful for 

CC producing a medicament for treating asthma, allergic rhinitis or 

CC hypertension. Antagonists for the ATP receptor can be used to treat

CC angina pectoris, ulcers, allergies, psychoses, depression, migraine, 

CC vomiting, benign prostatic hypertrophy, arterial thrombosis, myocardial 
CC infarction, thrombolysis, angioplasty, cystic fibrosis. Agonists of the
CC ATP receptor can be used to treat Parkinson's disease, acute heart
CC failure, hypotension, urinary retention and osteoporosis. The present

CC sequence represents cDNA encoding the human ATP receptor 
XX 

SQ Sequence 1428 BP; 394 A; 306 C; 290 G; 438 T; 0 U; 0 Other; 

Query Match 38.3%; Score 590.8; DB 8; Length 1428; 

Best Local Similarity 75.2%; Pred. No. 4e-139; 

Matches 763; Conservative 0; Mismatches 247; Indels 4; Gaps 2; 

Qy    39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db    99 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 158

Qy    99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db   159 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 218

Qy   159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db   219 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 278

Qy   219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db   279 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 338

Qy   279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db   339 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 398

Qy   339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db   399 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 458

Qy   399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || | ||| ||||||||||
Db   459 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTGTGCTATTTTAATCT 518

Qy   459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db   519 CCTTGGCCATGTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 578

Qy   519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db   579 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 638

Qy   579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db   639 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 698

Qy   639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db   699 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 758

Qy   699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db   759 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 818

Qy   759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db   819 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 878

Qy   819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db   879 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTG 938

Qy   876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db   939 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTGTGGGAGATCACTTCA 998

Qy   936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db   999 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1058

Qy   996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db  1059 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1112


RESULT 11 
AAT71900 

ID AAT71900 standard; cDNA; 1996 BP. 
XX 

AC AAT71900; 
XX 

DT 11-SEP-1997 (first entry)
XX 



DE Human purinergic receptor P2U2 cDNA.
XX
KW P2U2 receptor; purinergic receptor; diagnosis; therapy; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 625..1629
FT /*tag= a
XX
PN WO9720045-A2.
XX
PD 05-JUN-1997.
XX
PF 08-NOV-1996; 96WO-US018175.
XX
PR 15-NOV-1995; 95US-0006782P.
PR 15-NOV-1995; 95US-00559524.
XX
PA (CORT-) COR THERAPEUTICS INC.
XX
PI Conley PB, Jantzen H;
XX
DR WPI; 1997-310601/28.
DR P-PSDB; AAW19854.
XX
PT New isolated purinergic receptor sub-type - used to develop products for
PT diagnosis and therapy, e.g. for screening for agonists and antagonists
PT which can modulate activation.
XX
PS Claim 3; Fig 1A-C; 36pp; English.
XX
CC A cDNA clone (AAT71900) codes for a novel human purinergic receptor
CC subtype, designated P2U2 receptor (AAW19854), that is abundantly
CC expressed in kidney and in many cell lines of megakaryocytic or
CC erythroleukaemic origin and which is activated by ATP, UDP, UTP and UDP.
CC The clone was obtd. by amplifying DAMI (ATCC CRL 9792) cell cDNA using
CC primers (see also AAT72104-05) based on transmembrane regions of mouse
CC P2u and chicken P2Y1 receptors, and use of the PCR product to screen the
CC DAMI cDNA library to isolate the full-length clone. P2U2 nucleic acids
CC can be used in the recombinant prodn. of P2U2 receptor polypeptides and
CC as probes
XX
SQ Sequence 1996 BP; 513 A; 454 C; 381 G; 647 T; 0 U; 1 Other;

Query Match 38.2%; Score 589.2; DB 2; Length 1996;

Best Local Similarity 75.1%; Pred. No. 1.2e-138;

Matches 762; Conservative 0; Mismatches 248; Indels 4; Gaps 2;



Qy    39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
        | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db   632 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 691

Qy    99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
        |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db   692 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 751

Qy   159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
         ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db   752 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 811

Qy   219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
        ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db   812 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 871

Qy   279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
        |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db   872 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 931

Qy   339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
        |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db   932 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 991

Qy   399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
        | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db   992 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 1051

Qy   459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
        |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db  1052 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 1111

Qy   519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
        ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db  1112 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 1171

Qy   579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
        |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db  1172 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 1231

Qy   639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
        |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db  1232 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 1291

Qy   699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
        |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db  1292 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 1351

Qy   759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
        | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db  1352 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 1411

Qy   819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
        |   ||   |    || || |||   | |||||| ||  | ||||   |||||||| ||
Db  1412 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGGCTT 1471

Qy   876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
        ||| |||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db  1472 TGGGCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1531

Qy   936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
        | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db  1532 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1591

Qy   996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
        | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db  1592 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1645



RESULT 12 
AAT75146 

ID AAT75146 standard; cDNA; 1428 BP. 
XX 

AC AAT75146; 
XX 

DT 07-OCT-1997 (first entry) 
XX 

DE Human ATP receptor cDNA. 
XX 

KW ATP receptor; G-protein coupled receptor; agonist; antagonist; ss. 
XX 

OS Homo sapiens.



XX 

FH Key Location/Qualifiers
FT CDS 92..1096
FT /*tag= a
FT /transl_except= (pos:725..727, aa:Ser)
FT /transl_except= (pos:764..766, aa:Ser)
FT /transl_except= (pos:820..822, Xaa)
FT /note= "Xaa = unknown"
FT primer_bind complement(92..109)
FT /*tag= c
FT /note= "binding site for primers used to amplify cDNA for
FT bacterial or COS expression"
FT primer_bind complement(92..100)
FT /*tag= b
FT /note= "binding site for primer used to amplify cDNA for
FT baculovirus expression"
FT primer_bind 1076..1095
FT /*tag= d
FT /note= "binding site for primer used to amplify cDNA for
FT COS expression"
FT primer_bind 1079..1096
FT /*tag= e
FT /note= "binding site for primer used to amplify cDNA for
FT bacterial expression"
FT primer_bind 1085..1096
FT /*tag= f
FT /note= "binding site for primer used to amplify cDNA for
FT baculovirus expression"

XX 



PN WO9724929-A1.
XX 

PD 17-JUL-1997. 
XX 

PF 11-JAN-1996; 96WO-US000392.
XX

PR 11-JAN-1996; 96WO-US000392.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Li Y; 



XX



DR WPI; 1997-372505/34. 

DR P-PSDB; AAW22732. 
XX 

PT Isolated human ATP receptor - agonists and antagonists of which are 

PT useful in treatment of, e.g. asthma, hypertension, arterial thrombosis 

PT and psychotic and neurological disorders. 
XX 

PS Claim 7; Fig 1A-C; 53pp; English. 
XX 

CC A cDNA clone (AAT75146) codes for human ATP receptor (AAW22732), a 

CC polypeptide structurally related to the G protein-coupled receptor 

CC family. It was discovered in a human placenta cDNA library. cDNA encoding 

CC the mature receptor, deposited as ATCC 97333, can be expressed in 

CC bacterial (e.g. E. coli), mammalian (e.g. COS) or insect (e.g. Sf9) host 

CC cells and used to screen for agonists and antagonists useful in the 

CC treatment of a variety of disorders. It can also be used to identify a 

CC mutation in an ATP receptor gene and thus to diagnose diseases, or 

CC susceptibility to diseases, related to ATP receptor underexpression 

XX 

SQ Sequence 1428 BP; 394 A; 308 C; 290 G; 435 T; 0 U; 1 Other; 

Query Match 38.1%; Score 587.2; DB 2; Length 1428; 

Best Local Similarity 75.0%; Pred. No. 3.2e-138; 

Matches 760; Conservative 1; Mismatches 249; Indels 4; Gaps 2;

Qy 39 GC AGAAT GGCAC AGAAT T TAT C TT GT GAGAATT GGT T GGCAAC AGAG GCT AT CT T GAAT A 98 

I I MINI I I I I I I I I I I I I I I I I I I | | M I I I I I I I I I I I 
Db 99 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 158 

QY 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158 

I I M I I I I I I II I II I | | | II I I II II II I II I I I I I I I Ml II 

Db 1^9 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 218 

Qy 1^9 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

Ml M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I Mill | 
Db 219 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 278 

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

NIMH II I I I I II I I I I I I I I I I I I || I I I II I Mill II I M II II II I 

Db 279 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 338 

Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGC/^ACCGATATGTGCTTC 338 

I I I I I I I I I M M I M I II I I I II Mill I I I II M II II I II || | | I | M 
Db 339 AT G C CAAT G GAAACT GGAT AT AT GGAGAC GT GCT CT GC ATAAGCAAC C GAT AT GT GCT T C 398 

Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398 

I I I I I M II I I I II I M II Mill I I I I I I II II Mill II II II I II 

Db 399 AT G C CAAC CT CT AT AC CAGC AT TCTCTTTCT CACT T T TAT C AGCATAGAT C GAT AC T T GA 458 

Qy 399 T CAT GAAGT AC C CT TT CC GAGAAC AC TT T CT AC AAAAGAAG GAAT TT G C CAT T T TAAT CT 458 

I ' I I I I I I I I I I II II II I I II II I I II I II II I || I II I II II II I I M 
Db 459 TAAT T AAGT AT C CTT T C C GAGAAC AC C TT CT G CAAAAGAAAGAGT GT G C TAT T T TAAT CT 518 

Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518 

I I I I I I I M I 1 II I I II II II I I I II I I II I I I I | | | | | M 
Db 519 CCTTGGCCATGTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATA7^ATC 578 



Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578 

INI II || I I I I I I I I I I I I I I I I I M I I I II I I || | | | 
Db 57 9 C T GTT AT AACT GACAAT GGC AC C AC CT GT AAT GAT T T T GCAAGT T C T G GAGAC C C CAACT 638 

579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCT7VATTCCTCTCTCTGTGA 638 
I I I I I I I I II I I I I I I I I I II || | | I I I I I Mill I I I I I I I I I Mill 
Db 639 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 698 

Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698 

N M M I II II II I II II I I I II I I M II I II II I || | Ml 

Db 699 TGTGTTTCTTT TAT TACAAGAT T GC C T C C T T C CT AAAG CAGAGGAAT AGGCAGGT T G CT A 758 

Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758 

I I N I I II M II M II I Mill I I M I I II M I II II II II 

Db 759 CTGCCTCGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 818 

Qy 759 TACT C T T CAC AC C CT AT CAT AT CAT GC G CAAT T T GAGGAT C GC CT C AC G C CT G GATAGT T 818 

I M II II II II M II I II I I II I Ml I II I I I I I II II II II I I II MM 

Db 819 TGCYTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 878 

Qy 819 G G C C ACAAG GAT GT AC AC AGAAGG C CAT CAAAT CTAT AT ACACAC T GAC AC GGC CT C 875 

I N I I I I I I I I I II I I II M Mill I I II II I II I I 

Db 879 GGAAGCAGTATCAGTGCACTCAGGTCGT CAT CAACT CCTTTTACATTGTGACACGGCCTG 938 

Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935 

M I I M I I I I I I I I I I II II M I I M I II II I II I II II I I I II I I II I 
Db 939 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTGTGGGAGATCACTTCA 998 

Qy 936 GAGAGAT G CT GAT T AGT AAGT T C AGACAAT AC T T CAAGT C C C T T ACAT C C T T CAG GAC AT 995 

I I I I I I I M I II II MUM I II I I I I I || I II I II I I II I II II I 
Db 999 GG GACAT GCT GAT GAAT CAACT GAGAC ACAAC T T CAAAT C C CT T ACAT C CT T TAG CAGAT 1058 

Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048 

MM Ml I I I II II I I I I I M II I MM MUM 

Db 1059 GG GCT CAT GAACT C C T ACT T T CAT T CAGAGAAAAGT GAG GGGCT T GT GAAAC AG 1112 



RESULT 13 
AAC81122 

ID AAC81122 standard; cDNA; 1385 BP. 
XX 

AC AAC81122; 
XX 

DT 14-FEB-2001 (first entry) 
XX 

DE Human secreted protein gene 37 SEQ ID NO: 47. 
XX 

KW Human; secreted protein; diagnosis; immunosuppressive; antiarthritic; 

KW antirheumatic; antiproliferative; cytostatic; cardiant; vasotropic; 

KW cerebroprotective; nootropic; neuroprotective; antibacterial; virucide; 

KW fungicide; ophthalmological; vulnerary; gene therapy; autoimmune disease; 

KW hyperproliferative disorder; cardiovascular disorder; angiogenesis; 

KW cerebrovascular disorder; nervous system disorder; infection; skin aging; 

KW ocular disorder; wound healing; food additive; preservative; ss. 

XX 

OS Homo sapiens. 



PN WO200061628-A1. 
XX 

PD 19-OCT-2000. 
XX 

PF 06-APR-2000; 2000WO-US009070. 
XX 

PR 09-APR-1999; 99US-0128695P. 

PR 14-JAN-2000; 2000US-0176052P. 
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Ruben SM, Komatsoulis G; 
XX 

DR WPI; 2000-619228/59. 

DR P-PSDB; AAB45344. 
XX 

PT New nucleic acid molecules encoding 49 human secreted proteins for 

PT diagnosing, preventing, treating or ameliorating medical conditions and 

PT used as food additives or preservatives. 

XX 

PS Claim 1; Page 412; 454pp; English. 
XX 

CC The polynucleotide sequences given in AAC81086 to AAC81134 encode the 

CC human secreted proteins given in AAB45308 to AAB45356. AAB45357 to 

CC AAB45384 represent human secreted polypeptide sequences and proteins 

CC homologous to them, which are given in the exemplification of the present 

CC invention. Human secreted proteins have activities based on the tissues 

CC and cells the genes are expressed in. Examples of activities include: 

CC antiarthritic; immunosuppressive; antirheumatic; antiproliferative; 

CC cytostatic; cardiant; vasotropic; cerebroprotective; nootropic; 

CC neuroprotective; antibacterial; virucide; fungicide; ophthalmological; 

CC and vulnerary. The polynucleotides and polypeptides can be used to 

CC prevent, treat or ameliorate a medical condition in e.g. humans, mice, 

CC rabbits, goats, horses, cats, dogs, chickens or sheep. They are also used 

CC in diagnosing a pathological condition or susceptibility to a 

CC pathological condition. Disorders which are diagnosed or treated include 

CC autoimmune diseases, hyperproliferative disorders, cardiovascular 

CC disorders, cerebrovascular disorders, angiogenesis, nervous system 

CC disorders, infections caused by bacteria, viruses and fungi and ocular 

CC disorders. The polypeptides can also be used to aid wound healing and 

CC epithelial cell proliferation, to prevent skin aging due to sunburn, to 

CC maintain organs before transplantation, for supporting cell culture of 

CC primary tissues, to regenerate tissues and in chemotaxis. The 

CC polypeptides can also be used as a food additive or preservative to 

CC increase or decrease storage capabilities, fat content, lipid, protein, 

CC carbohydrate, vitamins, minerals, cofactors and other nutritional 

CC components. AAC81077 to AAC81085 and AAB45307 represent sequences used in 

CC the exemplification of the present invention 

XX 

SQ Sequence 1385 BP; 385 A; 296 C; 275 G; 429 T; 0 U; 0 Other; 

Query Match 37.6%; Score 580.4; DB 3; Length 1385; 

Best Local Similarity 75.2%; Pred. No. 1.7e-136; 

Matches 763; Conservative 0; Mismatches 246; Indels 5; Gaps 3; 



QY 39 GC AGAAT G G C AC AGAAT T TAT C T T GT GAGAAT T G GT T GGCAAC AGAG G C TAT C T T GAAT A 98 



56 G GAT CAT G G CAT G GAAT GCAACT T G CAAAAACT G G C T GG CAG C AGAGG CT G C C CT GGAAA 115 

99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158 
I I I I I I I I II II I I I I II I II I I I I I I I I I I M I II I I I I I I II 

116 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 17 5 

159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

HI M I I I I I I M I I I I I I I I I I I I I I I | | | | | | | | || Ml | | | | | | | 
176 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 235 

219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 27 8 

I I I I I M II I I I I I I I I I I I I I | | I I I I I I I I | | | | | M I I I I I I I I I I I I I 
236 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 295 

27 9 AT G C CAAT GATAAGGG GAC CT AT GGAGAT GT T C T CT GT AT AAGCAAC C GAT AT GT G CT T C 338 

I I I I I I I I I II III I I I II I I I II I I I II I I I I II II M I M I I I I I I I I I 
296 AT GC CAAT G GAAACT GGAT AT AT GGAGAC GT G C T CT G CAT AAGCAAC C GAT AT GT GCT T C 355 

339 AC AC CAAC C T CT AC AC CAGC AT C CT C T T C CT C AC T T T CAT T AGC AT GGAC C GAT AT CT GC 398 

I I I I I I II I I I I I I I I I II MM I I I I I I I I II I I I I I I I I | | | | || 
356 AT GC C AAC CT CT AT AC CAGC AT TCTCTTTCT C ACT T T TAT CAGC AT AGAT C GAT AC TT GA 415 

399 T CAT GAAGT AC C C T T T C C GAGAACACTT T CT ACAAAAGAAGGAAT T T GC C ATT T T AAT CT 4 58 

I II I I I I I I I I I I I I I I I I I I I I MM I I II I I I I M I I I II I I I II I I I I I 
416 TAAT T AAGT AT C C T T T C C GAGAACAC CT T CT GCAAAAGAAAGAGT T T GC T AT T T T AAT CT 475 

459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518 

I I I I I I I II I I I I I I II I I I I I I I I I I I II I I I I I I I I || I 
476 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 535 

519 C T GT C C CAAAAGAAGAGGGC AGTAACT G CAT C GACT AT GCAAGT T CT G GAAAC C CT GAAC 578 

I I M II M Mill I I I I I II I II I I I I I II I I I I I I I I I 
536 CT GT TAT AAC T GACAAT G GC AC C AC CT GTAAT GAT TTT GCAAGT T CT GGAGAC C C CAACT 595 

57 9 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638 

I I M I II M I I M I I I I II II II I I I I I M Mill I I I II M I I I II I I 
596 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 655 

639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698 

I I I I I I I I I M I I I I I I II I II I I I I I I I I I I M I I I I I II I 

65 6 T GT GT T T CTT TT AT T ACAAGAT TGCTCTCTTC CTAAAGC AGAGGAAT AG GCAG GT T GC T A 715 

699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758 

I M I I I I I I II I I I I I I I | | || | II I II I I I II I I M I I I I I 

716 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 775 

759 TACT CTT C AC AC CCT AT CAT AT CAT GC GCAAT T T GAGGAT C GC CT CAC GC CT G GAT AGT T 818 

I M M I I I I I I I M I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I II 
77 6 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 835 

819 G GC C ACAAGGAT GT AC AC AGAAGGC CAT CAAAT C T AT AT ACAC AC T GAC AC G G C CT C 875 

I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I 

836 G GAAG C AGT AT C AGT GCAC T CAG GT C GT CAT CAAC T C CT T TT AC ATT GT GAC AC - GC C T T 8 94 

876 TGGCCTTTCT GAACAGT G C CAT CAAT C C CAT CT T C T ACT T CCT CAT GG GAGAC CAT T AC A 935 
I I I M I I I I I I I | | | | || | | | | || | | I I | | | | | M | I I I I I I I I I I I I I 



Db 895 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 954 



QY 93 6 GAG AGAT G C T GAT T AGT AAGT T CAGAC AAT AC T T CAAGT CC CT T AC AT C C T T C AGGACAT 995 

I M I I II I I | | | | | | I | | | | I I I I I I I I I I I I I | | | | | | | | | | | M 
Db 955 G GG AC AT G CT GAT GAAT C AAC T GAGAC ACAACT T CAAAT CC CT T AC AT C C T T TAG C AGAT 1014 

QY 996 GAG C T GCT G GAT G C AG GT CT T C ACT CAG C C AAAA- T GAGACACT T GAT AAACAG 104 8 

IIM M I I I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 1015 G GGC T CAT GAAC T C CT AC T T T C ATT CAGAGAAAAGT GAG GGGCTTGT GAAACAG 1068 






RESULT 14 
ADC12679 

ID ADC12679 standard; DNA; 1005 BP. 
XX 

AC ADC12679; 
XX 

DT 18-DEC-2003 (first entry) 
XX 

DE Human GPCR gene, SEQ ID No 11. 
XX 

KW G protein-coupled receptor; GPCR; antibacterial; fungicide; protozoacide; 

KW virucide; antirheumatic; antiarthritic; tranquiliser; antidiabetic; 

KW osteopathic; nootropic; neuroprotective; anorectic; cardiant; 

KW neuroleptic; cytostatic; antiparkinsonian; hypotensive; hypertensive; 

KW antiulcer; antiallergic; anticonvulsant; analgesic; infection; 

KW rheumatoid arthritis; chronic obstructive pulmonary diseases; COPD; 

KW asthma; non-insulin dependent diabetes; obesity; osteoporosis; 

KW Alzheimer's disease; age-related macular degeneration; 

KW myocardial infarction; schizophrenia; osteoarthritis; cancer; 

KW Parkinson's disease; congestive heart failure; hypotension; hypertension; 

KW ulcer; allergy; benign prostatic hyperplasia; seizure disorder; anxiety; 

KW obsessive compulsive disorder; Cushing's syndrome; hypopituitarism; pain; 

KW gene; ds; human. 
XX 

OS Homo sapiens. 
XX 

PN WO2003000893-A2. 
XX 

PD 03-JAN-2003. 
XX 

PF 24-JUN-2002; 2002WO-IB002357. 
XX 

PR 26-JUN-2001; 2001US-0301095P. 

PR 06-NOV-2001; 2001US-0333185P. 
XX 

PA (DECO-) DECODE GENETICS EHF. 
XX 

PI Martinez RMA, Sigurdsson GT; 
XX 

DR WPI; 2003-210155/20. 

DR P-PSDB; ADC12680. 
XX 

PT New G protein-coupled receptor (GPCR) genes and polypeptides, useful for 

PT diagnosing diseases associated with a GPCR, or in gene therapy for 

PT treating e.g. obesity, osteoporosis, Alzheimer's, cancers or congestive 

PT heart failure. 



XX 
PS Claim 1; SEQ ID NO 11; 253pp; English. 
XX 
CC The invention relates to a novel isolated nucleic acid of a G protein- 
CC coupled receptor (GPCR) gene comprising any of 62 sequences of 912-2454 
CC bp, or its complements; a GPCR polypeptide comprising any of 62 sequences 
CC of 291-818 amino acids; or a nucleic acid that hybridises, under high 
CC stringency conditions, with any of the 62 GPCR sequences or any of their 
CC complements. The GPCR agents of the invention have the following 
CC activities: antibacterial, fungicide, protozoacide, virucide, 
CC antirheumatic, tranquiliser, antiarthritic, antidiabetic, osteopathic, 
CC nootropic, neuroprotective, anorectic, cardiant, neuroleptic, cytostatic, 
CC antiparkinsonian, hypotensive, hypertensive, antiulcer, antiallergic, 
CC anticonvulsant, and analgesic. The GPCR therapeutic agent, particularly a 
CC GPCR gene agonist or antagonist, is useful for treating a disease or 
CC condition associated with a GPCR in an individual. The nucleic acid cited 
CC above, which is 100 or fewer nucleotides in length, is useful for 
CC assaying a sample for the presence of the GPCR gene nucleic acid or a 
CC GPCR gene nucleic acid with at least one nucleotide difference from a 
CC first nucleic acid, or for diagnosing a susceptibility to a disease or 
CC conditions associated with a GPCR. These diseases include infections 
CC (e.g. bacterial, fungal, protozoan or viral), rheumatoid arthritis, 
CC chronic obstructive pulmonary diseases (COPD), asthma, non-insulin 
CC dependent diabetes, obesity, osteoporosis, Alzheimer's disease, age- 
CC related macular degeneration, myocardial infarction, schizophrenia, 
CC osteoarthritis, cancers, Parkinson's diseases, congestive heart failure, 
CC hypotension, hypertension, ulcers, allergies, benign prostatic 
CC hyperplasia, seizure disorder, anxiety, obsessive compulsive disorder, 
CC Cushing's syndrome, hypopituitarism, or pain. This polynucleotide 
CC sequence represents one of the 62 GPCR gene sequences of the invention. 
XX 
SQ Sequence 1005 BP; 244 A; 246 C; 187 G; 328 T; 0 U; 0 Other; 

Query Match 37.1%; Score 572.8; DB 9; Length 1005; 

Best Local Similarity 74.6%; Pred. No. 1.2e-134; 

Matches 734; Conservative 0; Mismatches 247; Indels 3; Gaps 1; 

Qy    86 GCTATCTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTG 145
         |||  | || | ||||||||||| ||    |||||||  || |||||| || | ||| |
Db     1 GCTGCCCTGGAAAAGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTC 60

Qy   146 CTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGC 205
         ||||| |||  || ||| || | ||||||| ||||||   |||||||||||||||||||
Db    61 CTTGGAAATACCATTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGT 120

Qy   206 AATGTCTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATC 265
         ||| | ||||| |||||||| ||  |||||||||| ||||| ||||||||||| |||||
Db   121 AATATTTATCTCTTTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATG 180

Qy   266 CTGATAAAGAGTTATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAAC 325
         ||||||| ||||||||||||||  ||  |||  |||||||| || ||||| |||||||||
Db   181 CTGATAAGGAGTTATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAAC 240

Qy   326 CGATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATG 385
         ||||||||||||||  |||||||||| |||||||| ||||| |||||||| || |||||
Db   241 CGATATGTGCTTCATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATA 300

Qy   386 GACCGATATCTGCTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTT 445
         || |||||  || | || ||||| ||||||||||||||| |||| |||||||| || |||
Db   301 GATCGATACTTGATAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTT 360

Qy   446 GCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTC 505
         || |||||||||||  ||||  | ||||  ||||| ||||||||  | |||||||| ||
Db   361 GCTATTTTAATCTCCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTT 420

Qy   506 ACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCT 565
          |  | || ||| ||||   ||  ||  | ||||  | ||| |  || | ||||||||||
Db   421 CCCCTTATAAATCCTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCT 480

Qy   566 GGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATT 625
         ||| ||||  |  |||| |||||||||||| | || || ||  ||||||| ||||| |||
Db   481 GGAGACCCCAACTACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATT 540

Qy   626 CCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGC 685
         ||||| | ||||||||| ||||| || |||||||| |   ||||| |||||  |||||
Db   541 CCTCTTTTTGTGATGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAAT 600

Qy   686 CAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTT 745
           ||||   || ||||| ||||| || || || || |    | ||||  || |||| ||
Db   601 AGGCAGGTTGCTACTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTG 660

Qy   746 GTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCA 805
         || ||||||||| | || || |||||||||||  ||||||| ||| |||||||||| |||
Db   661 GTAATCTTCTCTGTGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCA 720

Qy   806 CGCCTGGATAGTTG---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACA 862
         |||||||  |||||   ||   |    || || |||   | |||||| ||  | ||||
Db   721 CGCCTGGGGAGTTGGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATT 780

Qy   863 CTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATG 922
          ||||||||||| |||||||||||||||||| |||||| ||  ||||||| || ||  ||
Db   781 GTGACACGGCCTTTGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTG 840

Qy   923 GGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACA 982
         ||||| || | ||| || |||||||| | | |  | |||||  ||||||| |||||||||
Db   841 GGAGATCACTTCAGGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACA 900

Qy   983 TCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAATGAGACACTTGAT 1042
         ||||| || | ||| |||  || |  |     |||| ||||  |||||||  | | |
Db   901 TCCTTTAGCAGATGGGCTCATGAACTCCTACTTTCATTCAGAGAAAATGATTCTCCTTCC 960

Qy  1043 AAACAGTGCTGTGCAGTTGAGTTT 1066
           ||  | ||     | || | ||
Db   961 TCACCCTCCTCAAATGGTGCGATT 984





RESULT 15 
ADE85578/C 

ID ADE85578 standard; DNA; 639 BP. 
XX 

AC ADE85578; 
XX 

DT 29-JAN-2004 (first entry) 

DE Farnesyl transferase inhibitor modulated leukemia associated gene #797, 
XX 

KW ss; cytostatic; farnesyl transferase inhibitor; gene expression; 

KW quinolinone; leukemia; cancer. 

XX 

OS Homo sapiens. 
XX 

PN WO2003038129-A2. 
XX 

PD 08-MAY-2003. 
XX 

PF 30-OCT-2002; 2002WO-US034784. 
XX 

PR 30-OCT-2001; 2001US-0338997P. 

PR 30-OCT-2001; 2001US-0340081P. 

PR 30-OCT-2001; 2001US-0340938P. 

PR 30-OCT-2001; 2001US-0341012P. 
XX 

PA (ORTH ) ORTHO CLINICAL DIAGNOSTICS INC. 
XX 

PI Raponi M; 
XX 

DR WPI; 2003-513497/48. 
XX 

PT Determining whether a patient will respond to treatment with a farnesyl 

PT transferase inhibitor, by analyzing the expression of gene that is 

PT differentially modulated in the presence of the inhibitor. 
XX 

PS Disclosure; SEQ ID NO 797; 346pp; English. 
XX 

CC The invention relates to a method of determining whether a patient will 

CC respond to treatment with a farnesyl transferase inhibitor (FTI), by 

CC analyzing the expression of gene that is differentially modulated in the 

CC presence of an FTI. The method is useful for determining whether a 

CC patient will respond to treatment with a FTI such as (B)-6-[amino(4- 

CC chlorophenyl)(1-methyl-1H-imidazol-5-yl)methyl]-4-(3-chlorophenyl)-1- 

CC methyl-2-(1H)quinolinone, monitoring the therapy of a patient, treating a 

CC patient with leukemia with FTI if the analysis indicates that the patient 

CC will respond. This sequence corresponds to a gene whose expression may be 

CC modulated in the presence of FTI. 

XX 

SQ Sequence 639 BP; 189 A; 131 C; 131 G; 188 T; 0 U; 0 Other; 



Query Match 10.3%; Score 158.8; DB 9; Length 639; 

Best Local Similarity 72.2%; Pred. No. 6.8e-30; 

Matches 221; Conservative 0; Mismatches 82; Indels 3; Gaps 1; 

QY 727 CCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCG 786 

I I I I I I I I I I I I I I I I II I I I M I I I I 

Db 625 CT T GGT C AT CAT GGC ACT G GT AAT TAT C T CT GT GC TAT TAAC AC CAT AT C AC GT CAT GC G 566 

Qy 787 CAAT T T GAGG AT C GC CT CAC G C CT G GAT AGT T G GC CACAAGGAT GT AC AC AGAAGGC 843 

I I I I I I I I I Mill II I I I II I I I I 

Db 565 GT AT GT GAGGAT C G CT T CAC G C CT GGT GAGT T GAAAG C AGT AT C AGT G CAC T C AGGT C GT 506 

Qy 844 CATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCC 903 
I I I I I I I I I I I I I I I I I I I | | | | || | || M I I I I I I I I I I 



Db 505 CATCAACTCCTTTTACATTGTGACACGGCCTTTGGCCTTTCTGAACAGTGTCATCAACCC 446 

QY 904 CAT CTT CTACTT CCT CAT GGGAGAC CAT TACAGAGAGAT GCT GATT AGTAAGTT CAGACA 9 63 

I I I M M II II II I I II I I II I III II I I II I M I I I I I M I I I 

Db 445 T GT CT T CT AT T T T CT TAT GG GAGAT CAC T T C AG G GAC AT GC T GAT GAAT CAAC T GAGAC A 38 6 

Qy 964 AT ACT T C AAGT C C CT T AC AT C CT T C AGGAC AT GAG CT G C T G GAT GCAGGT CTT C ACT CAG 1023 

I I I I I I II I || I I I I | | | | || | | | Ml | | | | | | | | | 

Db 385 CAACTTCAAATCCCTTACATCCTTTAGCAGATGGGCTCATGAACTCCTACTTTCATTCAG 326 

Qy 1024 CCAAAA 1029 

I I I I 

Db 325 AGAAAA 320 
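
The summary figures printed with each result appear to be simple ratios, which makes them easy to sanity-check against OCR damage. The sketch below is an assumption drawn from the numbers in this listing, not from GenCore documentation: it treats Query Match as Score divided by Perfect score (1543 for this query) and Best Local Similarity as Matches divided by all alignment columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
# Hypothetical helpers; both formulas are inferred from figures printed in this report.
PERFECT_SCORE = 1543  # "Perfect score" from the report header


def query_match(score, perfect_score=PERFECT_SCORE):
    """Percent of the perfect score reached by a hit."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches, mismatches, indels, conservative=0):
    """Percent of alignment columns that are exact matches."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# (Score, Matches, Mismatches, Indels) copied from RESULT 13, 14 and 15.
hits = {
    "AAC81122": (580.4, 763, 246, 5),
    "ADC12679": (572.8, 734, 247, 3),
    "ADE85578": (158.8, 221, 82, 3),
}

for name, (score, matches, mismatches, indels) in hits.items():
    print(name,
          round(query_match(score), 1),
          round(best_local_similarity(matches, mismatches, indels), 1))
```

Rounded to one decimal, these ratios agree with the Query Match and Best Local Similarity values printed for the three results (37.6/75.2, 37.1/74.6 and 10.3/72.2).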



Search completed: August 24, 2004, 13:06:51 
Job time : 661 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: August 24, 2004, 12:29:30 ; Search time 124 Seconds 

(without alignments) 
6905.558 Million cell updates/sec 

Title: US-09-891-138A-1 

Perfect score: 1543 

Sequence: 1 gctcctggcagagttttctg tgcctaaataaatcaatata 1543 

Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 682709 seqs, 277475446 residues 

Total number of hits satisfying chosen parameters: 1365418 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Issued Patents NA: * 



/cgn2_6/ptodata/2/ina/5A_COMB.seq:* 
/cgn2_6/ptodata/2/ina/5B_COMB.seq:* 
/cgn2_6/ptodata/2/ina/6A_COMB.seq:* 
/cgn2_6/ptodata/2/ina/6B_COMB.seq:* 
/cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:* 
/cgn2_6/ptodata/2/ina/backfiles1.seq:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
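
The header names the IDENTITY_NUC scoring table with Gapop 10.0 and Gapext 1.0 but does not print the per-base scores. As a rough sketch, the reported Score can be rebuilt from the alignment counts under these assumptions: +1 per match, -0.6 per mismatch (a value inferred by fitting the figures in this listing, not stated in the report), Gapop charged once per gap and Gapext charged once per indel position. The function name is hypothetical.

```python
# Assumed scoring values: only GAP_OPEN and GAP_EXTEND come from the report header.
MATCH = 1.0        # assumption: identity scoring gives +1 per matching base
MISMATCH = -0.6    # assumption: inferred from the printed scores, not documented here
GAP_OPEN = 10.0    # "Gapop 10.0"
GAP_EXTEND = 1.0   # "Gapext 1.0"


def alignment_score(matches, mismatches, indels, gaps):
    """Rebuild a score from the counts printed under each result."""
    return (MATCH * matches
            + MISMATCH * mismatches
            - GAP_OPEN * gaps
            - GAP_EXTEND * indels)


# Counts for the top hit of this search: Matches 762; Mismatches 248; Indels 4; Gaps 2.
print(round(alignment_score(762, 248, 4, 2), 1))
```

Under the same assumptions the counts for the other results in this listing give 580.4, 572.8 and 158.8, which agrees with their printed scores. That supports the inferred mismatch value but does not confirm how the program defines it.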
RESULT 1 

US-08-559-524A-1 

; Sequence 1, Application US/08559524A 
; Patent No. 5871963 

GENERAL INFORMATION: 

APPLICANT: Conley, Pamela B. 
APPLICANT: Jantzen, Hans-Michael 
; TITLE OF INVENTION: NOVEL PURINERGIC RECEPTOR 

; NUMBER OF SEQUENCES: 14 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: MORGAN, LEWIS & BOCKIUS LLP 
; STREET: 1800 M Street, N.W. 

CITY: Washington 
STATE: D.C. 
COUNTRY: USA 
ZIP: 20036-5869 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 



COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.30 

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/559,524A 
; FILING DATE: 15-NOV-1995 

CLASSIFICATION: 435 
; ATTORNEY/AGENT INFORMATION: 

NAME: Adler, Reid G. 

REGISTRATION NUMBER: 30,988 
; REFERENCE/DOCKET NUMBER: 044481-5010-00-US 

TELECOMMUNICATION INFORMATION: 

TELEPHONE: 202-467-7000 
; TELEFAX: 202-467-7176 

INFORMATION FOR SEQ ID NO: 1: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 1996 base pairs 

TYPE: nucleic acid 
; STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS 

LOCATION: 625..1626 
US-08-559-524A-1 

Query Match 38.2%; Score 589.2; DB 2; Length 1996; 

Best Local Similarity 75.1%; Pred. No. 1e-156; 

Matches 762; Conservative 0; Mismatches 248; Indels 4; Gaps 2; 

Qy 39 GCAGAAT GG C AC AGAATT T AT C T T GT GAGAAT T G GT T GGCAACAGAGGCT AT CTT GAAT A 98 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 632 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 691 

Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | || 

Db 692 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 751 

Qy 159 CT GT GGT GT T C G GCT AC C T CT T C T GCAT GAAGAAC T GGAACAGCAGCAAT GT C T AT CT T T 218 

Ml M I I I I I I I I MINI I I I II II M II I I M || I I I I I | Mill I 
Db 752 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 811 

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

I I I I I I I M I I I II I I II I I I I II I I I I I I I II I I i I I I I I I I I I I Mill 

Db 812 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 871 

Qy 279 AT G C CAAT GAT AAG G G GAC C TAT GGAGAT GT T C T C T GT AT AAGCAAC C GAT AT GT GCT T C 338 

M I I I M I I M Ml M I II I II II Mill I I I I I I I I II I I M I I I I I I I I 
Db 872 AT GC CAAT GGAAACT GGAT AT AT G GAGAC GT G CT C T G CAT AAG CAAC C GAT AT GT GCT T C 931 

Qy 339 ACAC CAAC C T C T AC AC C AG CAT C CT C T T C CT C AC T T T CAT T AGCAT GGAC C GAT AT CT G C 398 

I I I I I I I I I I I I I I II II I II I I I I I I I II I I II Mill II I I I I I II 
Db 932 AT GC CAAC CT C TAT AC C AGCAT TCTCTTTCT C ACT T T TAT C AG C AT AGAT C GAT ACTT GA 991 

Qy 399 T CAT GAAGT AC C CT T T C C GAGAACAC T T T CT ACAAAAGAAG GAAT T T GC C AT T T TAAT C T 458 

I II Mill II I M I I II II II II I II I II I II I I I II I I II I II I I I I I I II 
Db 992 TAAT TAAGTAT CCTT T C C GAGAACAC CT T CT GCAAAAGAAAGAGT T T GCT ATTTTAAT CT 1051 



Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518 

I MM I M M M M I I M M M I I M M M M M I I M Ml 

Db 1052 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 1111 

Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578 

MM M M I M M I I II I M I M I M M M M M M M I 

Db 1112 CT GT T ATAAC T GACAAT G G C AC C AC CT GT AAT GAT T T T G CAAGT T CT G GAGAC C C CAACT 1171 

Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638 

MM M M M M M M I M M I I M M M I M M I M M M M I M M I 

Db 1172 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 1231 

Qy 639 T GT G C T T CT T C T AC TACAAGAT GGT AGT CT T C T TAAAGAGGAGGAGC C AG C AGCAAGCAA 698 

M M M M I M M M M M I M M I M M I M M I MM Ml 

Db 1232 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 1291 

Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758 

I I I I I I I II II I I II I I I I II I I II II I I M I I I II II I II I 

Db 12 92 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 1351 

Qy 759 TACT CT T C ACAC C C TAT CAT AT CAT GC G CAAT T T GAGG AT C GC CT C AC GC CT GGATAGTT 818 

I II II M II I I I II M I M I I II III I I M II II I I M I I II II I I MM 

Db 1352 T GC TT T T T AC AC C CTAT C AC GT C AT GC GGAAT GT GAG GAT C GCTT C AC GC C T GGGGAGT T 1411 

Qy 819 G GC C ACAAG GAT GT AC AC AGAAGGC C AT C AAAT CTAT AT AC AC ACT GAC AC G GC CT C 875 

I M I I I II M I I II I II I II I I II I I I II I II I M 

Db 1412 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGGCTT 1471 

Qy 87 6 TGGCCTTTCT GAAC AGT G C CAT CAAT C C CAT CT T CT ACT T C C T C AT GGGAGAC CAT T AC A 935 

Ml I I M II II I I I I I I I I I I I I II I I I II II II II I M I I II M I II 
Db 1472 TGGGCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1531 

Qy 936 GAGAGAT GCT GAT T AGT AAGT T CAGACAAT ACT T CAAGT C C CT T ACAT C CTT CAGGACAT 995 

I I I I I I I I M I I I I I I II II I I M I II M I II II I I M I I I I I I I I 
Db 1532 GGGAC AT GCT GAT GAAT CAAC T GAGAC ACAACTT CAAAT C C CT T ACAT C CT T T AGCAGAT 1591 

Qy 996 GAGCT G C T G GAT GC AG GT CT T C ACT CAG C CAAAA- T GAGAC ACT T GATAAAC AG 1048 

I I II Ml I II II II II II II II II I I II MUM 

Db 1592 GGGCT CAT GAACT C C TAC TT T CAT T C AGAGAAAAGT GAGG GG C T T GT GAAAC AG 1645 



RESULT 2 
US-08-749-707-1 

; Sequence 1, Application US/08749707 
; Patent No. 6063582 

GENERAL INFORMATION: 
; APPLICANT: Conley, Pamela B. 

APPLICANT: Jantzen, Hans-Michael 

TITLE OF INVENTION: NOVEL PURINERGIC RECEPTOR 

NUMBER OF SEQUENCES: 14 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: MORGAN, LEWIS & BOCKIUS LLP 

; STREET: 1800 M Street, N.W. 

; CITY: Washington 

STATE: D.C. 

COUNTRY: USA 



ZIP: 20036-5869 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/749,707 
FILING DATE: 15-NOV-1996 
CLASSIFICATION: 536 
ATTORNEY/AGENT INFORMATION: 
NAME: Adler, Reid G. 
REGISTRATION NUMBER: 30,988 

REFERENCE/DOCKET NUMBER: 044481-5010-01-US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 202-467-7000 
TELEFAX: 202-467-7176 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1996 base pairs 
TYPE: nucleic acid 
STRANDEDNESS : single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE : 

NAME/KEY: CDS 
LOCATION: 625..1626 
US-08-749-707-1 

Query Match 38.2%; Score 589.2; DB 3; Length 1996; 

Best Local Similarity 75.1%; Pred. No. 1e-156; 

Matches 762; Conservative 0; Mismatches 248; Indels 4; Gaps 2; 

Qy 39 G C AGAAT GGCAC AGAAT T TAT CT T GT GAGAAT T GGTT GG CAAC AGAGGCT AT CT T GAAT A 98 

II I I I I I I III! I I II I I I I I I I I I I I I I I I I I M I I I I I I 
Db 632 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 691 

Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158 

II I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I II I I I I I II 

Db 692 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 751 

Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

III II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I II I I I 
Db 752 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 811 

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 812 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 871 

Qy 27 9 AT G C CAAT GAT AAG GG GAC CT AT GGAGAT GT T CT C TGT ATAAG CAAC C GAT AT GT GC T T C 338 

I I I I I I I I I II III II I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I II 
Db 872 AT G C CAAT GGAAAC T G GAT AT AT G GAGAC GT GCT CTGC AT AAG CAAC C GAT AT GT GC T T C 931 

Qy 339 AC AC CAAC CT CT ACAC C AGC AT CCTCTTCCT C ACT TT CAT T AGCAT G GAC C GAT AT CTGC 398 

I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I II I I I I I II I I I I I II 
Db 932 AT GC CAAC C T C TAT AC C AG CAT TCTCTTTCT C AC T T T TAT C AG CAT AG AT C GAT ACT T G A 991 



Qy 399 T CAT GAAGT AC C CT T T C C GAGAACACT T T CTACAAAAGAAGGAAT T T G C CAT T T T AAT CT 458 

I II I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 992 T AAT T AAGT AT C CT T T C C GAGAAC AC C T T C T GC AAAAGAAAGAGT T T G C TAT T T T AAT C T 1051 

Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518 

I I I I I I I I II I I I I I I I I I I I II I I I I I I I II II I I I I I I I 
Db 1052 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 1111 

Qy 519 C T GT CC CAAAAGAAGAGG G C AGT AAC T G CAT C GAC TAT G CAAGT T CT GGAAAC C CT GAAC 578 

I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 1112 C T GT TAT AAC T G ACAAT G G C AC C AC C T GT AAT GAT T T T G CAAGTT CT GGAGAC C C CAACT 1171 

Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638 

I I I I I I II I I I I I I I I I II II II I I I I I I I I I I I I I I I I I I II I I I II I 
Db 1172 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 1231 

Qy 639 T GT GCT T C TT CT ACT ACAAG AT GGT AGT CT T CT TAAAGAGGAGGAG C C AGC AG C AAG CAA 698 

I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I II I I III 

Db 1232 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 1291 

Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1292 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 1351 

Qy 759 TACT CT T C ACAC C C TAT CAT AT CAT G C G CAATT T GAGGAT C G C CT CAC G C C T GGATAGT T 818 

I II II I I I I I I I I I II II I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1352 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 1411 

Qy 819 G GCCACAAGGAT GTACACAGAAGGCCATCAAATCTATATACACACT GACACGGCCTC 875 

I II I II II III I I I I I I I I I Mill I I I I I I I I I I 

Db 1412 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGGCTT 1471 

Qy 87 6 T GGC CT TT CT GAACAGT GC C AT C AAT C C C AT CT T C TACT T C CT CAT GGGAGAC CAT T AC A 935 

I I I I I I I I I I I II I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1472 TGGGCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1531 

Qy 936 GAGAGAT GCT GAT T AGT AAGT T C AGAC AAT ACT T CAAGT C C C T T ACAT C CT T CAGGAC AT 995 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1532 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1591 

Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048 

III III I I I II I I I I I I I I I I I I I I I I I I I I I I 

Db 1592 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1645 



RESULT 3 
US-09-947-922-1 
; Sequence 1, Application US/09947922 
; Patent No. 6680373 
; GENERAL INFORMATION: 
; APPLICANT: Conley, Pamela B. 
; Jantzen, Hans-Michael 
; TITLE OF INVENTION: NOVEL PURINERGIC RECEPTOR 
; NUMBER OF SEQUENCES: 14 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: MORGAN, LEWIS & BOCKIUS LLP 
; STREET: 1800 M Street, N.W. 
; CITY: Washington 
; STATE: D.C. 
; COUNTRY: USA 
; ZIP: 20036-5869 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.30 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/947,922 
; FILING DATE: 07-Sep-2001 
; CLASSIFICATION: <Unknown> 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: US/08/749,707 
; FILING DATE: 15-NOV-1996 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Adler, Reid G. 
; REGISTRATION NUMBER: 30,988 
; REFERENCE/DOCKET NUMBER: 044481-5010-01-US 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 202-467-7000 
; TELEFAX: 202-467-7176 
; INFORMATION FOR SEQ ID NO: 1: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 1996 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; MOLECULE TYPE: cDNA 
; FEATURE: 
; NAME/KEY: CDS 
; LOCATION: 625..1626 
; SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
US-09-947-922-1 

Query Match 38.2%; Score 589.2; DB 4; Length 1996; 
Best Local Similarity 75.1%; Pred. No. 1e-156; 
Matches 762; Conservative 0; Mismatches 248; Indels 4; Gaps 2; 

Qy 39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98 

Db 632 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 691 

Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158 

Db 692 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 751 

Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

Db 752 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 811 

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

Db 812 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 871 

Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338 



1 1 1 1 1 1 1 1 1 II III II 1 1 1 1 1 1 II I II 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 

Db 872 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 931 

Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II INN II I I II I II 

Db 932 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 991 

Qy 399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458 

I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I II 
Db 992 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 1051 

Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518 

I I I I I Mill I I I I I I I I I I I I I I I I I I II II I I I I I I I I I 
Db 1052 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 1111 

Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578 

I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1112 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 1171 

Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638 

I I I I I I I I I I I I I I I I I II II II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1172 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 1231 

Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I III 

Db 1232 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 1291 

Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I II I I 

Db 1292 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 1351 

Qy 759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818 

III II I I I I II I I M I I M I I I I III I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1352 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 1411 

Qy 819 G GC C ACAAGGAT GT AC AC AGAAGGC C AT CAAAT CT AT AT AC ACACT GAC AC G GC CT C 875 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1412 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGGCTT 1471 

Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935 

III I I I I I I I I I I I I II I I I I II II I I I I I I I II II I II I I I I II I M 
Db 1472 TGGGCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1531 

Qy 936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 1532 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1591 

Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048 

I I I I III I I I I I I II I I I II I I I I I I I I I I I I I I 

Db 1592 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1645 



RESULT 4 

US-09-016-434-1068 
; Sequence 1068, Application US/09016434 
; Patent No. 6500938 
; GENERAL INFORMATION: 
; APPLICANT: Janice Au-Young 
; APPLICANT: Jeffrey J. Seilhamer 
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING 
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION 
; NUMBER OF SEQUENCES: 1490 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 
; STREET: 3174 PORTER DRIVE 
; CITY: PALO ALTO 
; STATE: CALIFORNIA 
; COUNTRY: USA 
; ZIP: 94304 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/016,434 
; FILING DATE: HEREWITH 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 
; FILING DATE: 
; CLASSIFICATION: 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Zeller, Karen J. 
; REGISTRATION NUMBER: 37,071 
; REFERENCE/DOCKET NUMBER: PA-0002 US 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (650) 855-0555 
; TELEFAX: (650) 845-4166 
; INFORMATION FOR SEQ ID NO: 1068: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 1429 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; IMMEDIATE SOURCE: 
; LIBRARY: GENBANK 
; CLONE: g1124904 
US-09-016-434-1068 

Query Match 5.7%; Score 88.4; DB 4; Length 1429; 
Best Local Similarity 45.7%; Pred. No. 5.8e-15; 
Matches 385; Conservative 0; Mismatches 451; Indels 6; Gaps 2; 

Qy 107 CTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTG 166 

II III I I I I I I I I II I I I I M I I I I I I 

Db 292 CTGCCTGTGAGCTATGCAGTTGTCTTTGTGCTGGGCTTGGGCCTTAACGCCCCAACCCTA 351 

Qy 167 TTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTT 226 

I II I Mill II I Mill I Ml III MM 

Db 352 TGGCTCTTCATCTTCCGCCTCCGACCCTGGGATGCAACGGCCACCTACATGTTCCACCTG 411 

Qy 227 TCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTAT---GCC 283 

I I II III I II I M II M I I I I I I MM Ml 



Db 412 GCATTGTCAGACACCTTGTATGTGCTGTCGCTGCCCACCCTCATCTACTATTATGCAGCC 471 

Qy 284 AATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACC 343 

I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 472 CACAACCACTGGCCCTTTGGCACTGAGATCTGCAAGTTCGTCCGCTTTCTTTTCTATTGG 531 

Qy 344 AACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATG 403 

I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I M III 
Db 532 AACCTCTACTGCAGTGTCCTTTTCCTCACCTGCATCAGCGTGCACCGCTACCTGGGCATC 591 

Qy 404 AAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTG 463 

I II I I I I I I I Ml I I I I I M 

Db 592 TGCCACCCACTTCGGGCACTACGCTGGGGCCGCCCTCGCCTCGCAGGCCTTCTCTGCCTG 651 

Qy 464 GCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTC 523 

I I I I I I I I I I I I I I I I I I I I I 

Db 652 GCAGTTTGGTTGGTCGTAGCCGGCTGCCTCGTGCCCAACCTGTTCTTTGTCACAACCAGC 711 

Qy 524 CCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAAT 583 

I I I I I III I II I I I I I I I Ml IN II 

Db 712 AACAAAGGGACCACCGTCCTGTGCCATGACACCACTCGGCCTGAAGAGTTTGACCACTAT 771 

Qy 584 CTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGC 643 

I I I I I I I I I I I I I I III I Ml 

Db 772 GTGCACTTCAGCTCGGCGGTCATGGGGCTGCTCTTTGGCGTGCCCTGCCTGGTCACTCTT 831 

Qy 644 TTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCC 703 

I I I I I I I I I III I I IN I 

Db 832 GTTTGCTATGGACTCATGGCTCGTCGCCTGTATCAGCCCTTGCCAGGCTCTGCACAGTCG 891 

Qy 704 CTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTC 763 

II I I I I I I II I I I I I I I I I I I I I 

Db 892 TCTTCTCGCCTCCGCTCTCTCCGCACCATAGCTGTGGTGCTGACTGTCTTTGCTGTCTGC 951 

Qy 764 TTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCA 823 

III I I I I I I I I I I I I I I I I I M I I 
Db 952 TTCGTGCCTTTCCACATCACCCGCACCATTTACTACCTGGCCAGGCTGTTGGAA---GCT 1008 

Qy 824 CAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTT 883 

I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I 

Db 1009 GACTGCCGAGTACTGAACATTGTCAACGTGGTCTATAAAGTGACTCGGCCCCTGGCCAGT 1068 

Qy 884 CTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATG 943 

I I I I I II I I I I I M I I I II I I I I I I I I I I I I II I 
Db 1069 GCCAACAGCTGCCTGGATCCTGTGCTCTACTTGCTCACTGGGGACAAATATCGACGTCAG 1128 

Qy 944 CT 945 

I I 

Db 1129 CT 1130 



RESULT 5 

US-09-016-434-1456 
; Sequence 1456, Application US/09016434 
; Patent No. 6500938 
; GENERAL INFORMATION: 
; APPLICANT: Janice Au-Young 
; APPLICANT: Jeffrey J. Seilhamer 
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING 
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION 
; NUMBER OF SEQUENCES: 1490 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 
; STREET: 3174 PORTER DRIVE 
; CITY: PALO ALTO 
; STATE: CALIFORNIA 
; COUNTRY: USA 
; ZIP: 94304 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/016,434 
; FILING DATE: HEREWITH 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 
; FILING DATE: 
; CLASSIFICATION: 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Zeller, Karen J. 
; REGISTRATION NUMBER: 37,071 
; REFERENCE/DOCKET NUMBER: PA-0002 US 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (650) 855-0555 
; TELEFAX: (650) 845-4166 
; INFORMATION FOR SEQ ID NO: 1456: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 3055 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; IMMEDIATE SOURCE: 
; LIBRARY: GENBANK 
; CLONE: g798835 
US-09-016-434-1456 

Query Match 5.6%; Score 86.4; DB 4; Length 3055; 
Best Local Similarity 46.1%; Pred. No. 3.4e-14; 
Matches 402; Conservative 0; Mismatches 461; Indels 9; Gaps 3; 

Qy 80 ACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTT 139 

II Ml III I I I I I I I I I IN I M M M I II I 

Db 982 ACCAAGACGGGCTTCCAGTTTTACTACCTGCCGGCTGTCTACATCTTGGTATTCATCATC 1041 

Qy 140 GGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAAC 199 

II I I I I I I I I I II I I I I I I I I I I I I I I I I M I I 

Db 1042 GGCTTCCTGGGCAACAGCGTGGCCATCTGGATGTTCGTCTTCCACATGAAGCCCTGGAGC 1101 

Qy 200 AGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTT 259 

I I I I I I I I I I I I I I I I I I I 11 I I I I I I I I I 

Db 1102 GGCATCTCCGTGTACATGTTCAATTTGGCTCTGGCCGACTTCTTGTACGTGCTGACTCTG 1161 



Qy 260 C C CAT C C T GAT AAAGAGT TAT G C CAAT GAT A AGGGGACCTATGGAGATGTTCTCTGT 316 

II II I I II II I I I I I I I II I I I I I I II I I III 

Db 1162 CCAGCCCTGATCTTCTACTACTTCAATAAAACAGACTGGATCTTCGGGGATGCCATGTGT 1221 

Qy 317 ATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTC 376 

II I II I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1222 AAACT GCAGAG GT T CAT C T T T CAT GT GAAC C T C T A TGGCATCTTGTTTCTGACATGC 1278 

Qy 377 ATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAG 436 

I I I I I I I I I I I I I I I II I I I I I I I I 

Db 1279 ATCAGTGCCCACCGGTACAGCGGTGTGGTGTACCCCCTCAAGTCCCTGGGCCGGCTCAAA 1338 

Qy 437 AAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTA 496 

I I I I I II I I I III II I I I I I I I I I I II I 

Db 1339 AAGAAGAATGCGATCTGTATCAGCGTGCTGGTGTGGCTCATTGTGGTGGTGGCGATCTCC 1398 

Qy 497 CCCATGCTCACTTTCATCAATTCT GT CC CAAAAGAAGAGGGCAGTAACT GCAT C GAC 553 

I I I I I I I I I I II I I I I I I I I I II I I I I I I I I 

Db 1399 CCCATCCTCTTCTACTCAGGTACCGGGGTCCGCAAAAACAAAACCATCACCTGTTACGAC 1458 

Qy 554 TATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTG 613 

I I III I I I I I I I II I I I I I I I I III II 

Db 1459 ACCACCTCAGACGAGTACCTGCGAAGTTATTTCATCTACAGCATGTGCACGACCGTGGCC 1518 

Qy 614 GGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTA 673 

III I I I I I I I I I I I I I I I I I II 

Db 1519 ATGTTCTGTGTCCCCTTGGTGCTGATTCTGGGCTGTTACGGATTAATTGTGAGAGCTTTG 1578 

Qy 674 AAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTG 733 

I I I I I I I I I II I II II 

Db 1579 ATTTACAAAGATCTGGACAACTCTCCTCTGAGGAGAAAATCGATTTACCTGGTAATCATT 1638 

Qy 734 GTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCGCAATTTG 793 

I I I I I I I I I I I I II II I III I I I I I 

Db 1639 GTACTGACTGTTTTTGCTGTGTCTTACATCCCTTTCCATGTGATGAAAACGATGAACTTG 1698 

Qy 794 AGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAAGGCCATCAAATCT 853 

I I I I III II II II III 

Db 1699 AGGGCCCGGCTTGATTTTCAGACCCCAGCAATGTGTGCTTTCAATGACAGGGTTTATGCC 1758 

Qy 854 ATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTAC 913 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1759 ACGTATCAGGTGACAAGAGGTCTAGCAAGTCTCAACAGTTGTGTGGACCCCATTCTCTAT 1818 

Qy 914 TTCCTCATGGGAGACCATTACAGAGAGATGCT 945 

I I I I I I II I I I I I I II II II 
Db 1819 TTCTTGGCGGGAGATACTTTCAGAAGGAGACT 1850 



RESULT 6 

US-09-016-434-1482 
; Sequence 1482, Application US/09016434 
; Patent No. 6500938 
; GENERAL INFORMATION: 
; APPLICANT: Janice Au-Young 
; APPLICANT: Jeffrey J. Seilhamer 
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING 
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION 
; NUMBER OF SEQUENCES: 1490 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 
; STREET: 3174 PORTER DRIVE 
; CITY: PALO ALTO 
; STATE: CALIFORNIA 
; COUNTRY: USA 
; ZIP: 94304 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/016,434 
; FILING DATE: HEREWITH 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 
; FILING DATE: 
; CLASSIFICATION: 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Zeller, Karen J. 
; REGISTRATION NUMBER: 37,071 
; REFERENCE/DOCKET NUMBER: PA-0002 US 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (650) 855-0555 
; TELEFAX: (650) 845-4166 
; INFORMATION FOR SEQ ID NO: 1482: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 2025 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; IMMEDIATE SOURCE: 
; LIBRARY: GENBANK 
; CLONE: g984506 
US-09-016-434-1482 

Query Match 5.5%; Score 85.4; DB 4; Length 2025; 
Best Local Similarity 46.5%; Pred. No. 5.1e-14; 
Matches 389; Conservative 0; Mismatches 436; Indels 12; Gaps 3; 

Qy 91 CTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGG 150 

Db 335 CTTCAAGTACGTGCTGCTGCCTGTGTCCTACGGCGTGGTGTGCGTGCTTGGGCTGTGTCT 394 

Qy 151 GAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGT 210 

Db 395 GAACGCCGTGGCGCTCTACATCTTCTTGTGCCGCCTCAAGACCTGGAATGCGTCCACCAC 454 

Qy 211 CTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGAT 270 

I I I I I I I I I I I I II I I I I I I I I II I I I I I I I 

Db 455 ATATATGTTCCACCTGGCTGTGTCTGATGCACTGTATGCGGCCTCCCTGCCGCTGCTGGT 514 



Qy 271 AAAGAGT T AT GC CAAT GAT AAG G GG AC CT AT G GAGAT GT T CT CT GT AT AAGC AAC C G 327 

I I I II I I I I I I II I I I I I I I I I I II 

Db 515 CTATTACTACGCCCGCGGCGACCACTGGCCCTTCAGCACGGTGCTCTGCAAGCTGGTGCG 574 

Qy 328 ATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGA 387 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 575 CTTCCTCTTCTACACCAACCTTTACTGCAGCATCCTCTTCCTCACCTGCATCAGCGTGCA 634 

Qy 388 CCGATATCTGCTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGC 447 

I I I I I I I I I I I I I I I I I I IN 

Db 635 CCGGTGTCTGGGCGTCTTACGACCTCTGCGCTCCCTGCGCTGGGGCCGGGCCCGCTACGC 694 

Qy 448 CATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCAC 507 

I I I I I I I I I I I I I I I I I I I I I I II 

Db 695 TCGCCGGGTGGCCGGGGCCGTGTGGGTGTTGGTGCTGGCCTGCCAGGCCCCCGTGCTCTA 754 

Qy 508 TTTCATCAATTCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGG 567 

Mill I M II M I I I I I I I I I I II 

Db 755 CTTTGTCACCACCAGCGCGCGCGGGGGCCGCGTAACCTGCCACGACACCTCGGCACCCGA 814 

Qy 568 AAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCC 627 

I II I I I I I I I III I I I I I I III 

Db 815 GCTCTTCAGCCGCTTCGTGGCCTACAGCTCAGTCATGCTGGGCCTGCTCTTCGCGGTGCC 874 

Qy 628 TCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCA 687 

I I II II I I I I I I I I I I I II II 

Db 875 CTTTGCCGTCATCCTTGTCTGTTACGTGCTCATGGCTCGGCGACTGCTAAAGCCAGCCTA 934 

Qy 688 GCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGT 747 

I I I I I I I I I I I I I I I I III I I I I I I I I 

Db 935 CGGGACCTCGGGCGGCCTCCCTAGGGCCAAGCGCAAGTCCGTGCGCACCATCGCCGTGGT 994 

Qy 748 G AT CT T CT CT AT ACT C T T CACAC C CT AT CAT AT CAT GC GCAAT T T GAGGAT C GC 801 

I I I I II I I I I I I I I I I I I I I I I I I I M 

Db 995 GCTGGCTGTCTTCGCCCTCTGCTTCCTGCCATTCCACGTCACCCGCACCCTCTACTACTC 1054 

Qy 802 CTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACAC 861 

I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1055 CTTCCGCTCGCTGG---ACCTCAGCTGCCACACCCTCAACGCCATCAACATGGCCTACAA 1111 

Qy 862 ACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCT 918 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I 
Db 1112 GGTTACCCGGCCGCTGGCCAGTGCTAACAGTTGCCTTGACCCCGTGCTCTACTTCCT 1168 



RESULT 7 

US-09-016-434-1108 
; Sequence 1108, Application US/09016434 
; Patent No. 6500938 
; GENERAL INFORMATION: 
; APPLICANT: Janice Au-Young 
; APPLICANT: Jeffrey J. Seilhamer 
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING 
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION 
; NUMBER OF SEQUENCES: 1490 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 
; STREET: 3174 PORTER DRIVE 
; CITY: PALO ALTO 
; STATE: CALIFORNIA 
; COUNTRY: USA 
; ZIP: 94304 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/016,434 
; FILING DATE: HEREWITH 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 
; FILING DATE: 
; CLASSIFICATION: 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Zeller, Karen J. 
; REGISTRATION NUMBER: 37,071 
; REFERENCE/DOCKET NUMBER: PA-0002 US 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (650) 855-0555 
; TELEFAX: (650) 845-4166 
; INFORMATION FOR SEQ ID NO: 1108: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 1571 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; IMMEDIATE SOURCE: 
; LIBRARY: GENBANK 
; CLONE: g1296659 
US-09-016-434-1108 

Query Match 5.4%; Score 82.8; DB 4; Length 1571; 
Best Local Similarity 46.2%; Pred. No. 2.4e-13; 
Matches 390; Conservative 0; Mismatches 442; Indels 12; Gaps 3; 

Qy 89 ATCTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTT 148 

Db 343 AACTTCAAGCAACTGCTGCTGCCACCTGTGTATTCGGCGGTGCTGGCGGCTGGCCTGCCG 402 

Qy 149 GGGAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAAT 208 

Db 403 CTGAACATCTGTGTCATTACCCAGATCTGCACGTCCCGCCGGGCCCTGACCCGCACGGCC 462 

Qy 209 GTCTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTG 268 

Db 463 GTGTACACCCTAAACCTTGCTCTGGCTGACCTGCTATATGCCTGCTCCCTGCCCCTGCTC 522 

Qy 269 AT AAAG AGT TAT GC C AA TGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAAC 325 

Db 523 ATCTACAACTATGCCCAAGGTGATCACTGGCCCTTTGGCGACTTCGCCTGCCGCCTGGTC 582 

Qy 326 CGATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATG 385 



Ill I I I 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 I 

Db 583 CGCTTCCTCTTCTATGCCAACCTGCACGGCAGCATCCTCTTCCTCACCTGCATCAGCTTC 642 

Qy 386 GAC C GAT AT CT G CT CAT GAAGT AC C CT TT C C G AGAAC AC T T T C T AC AAAAG AAG GAA 442 

I II II III III M I I I M I I I I 

Db 643 CAGCGCTACCTGGGCATCTGCCACCCGCTGGCCCCCTGGCACAAACGTGGGGGCCGCCGG 702 

Qy 443 TTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATG 502 

I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 703 GCTGCCTGGCTAGTGTGTGTAGCCGTGTGGCTGGCCGTGACAACCCAGTGCCTGCCCACA 762 

Qy 503 CTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGT 562 

II I I I III I I I I I I I I I I III 

Db 763 GCCATCTTCGCTGCCACAGGCATCCAGCGTAACCGCACTGTCTGCTATGACCTCAGCCCG 822 

Qy 563 TCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTA 622 

III I I I I I I I II III I I I I I I I I I II I I I I 

Db 823 CCTGCCCTGGCCACCCACTATATGCCCTATGGCATGGCTCTCACTGTCATCGGCTTCCTG 882 

Qy 623 ATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGG 682 

I I I I I M II I I I I I I I I II III I I 

Db 883 CTGCCCTTTGCTGCCCTGCTGGCCTGCTACTGTCTCCTGGCCTGCCGCCTGTGCCGCCAG 942 

Qy 683 AGCCAGCAGCAAGCAACTG------CCCTGCCACTGGACAAACCCCAACGCCTGGTGGTC 736 

I I I I III I I I I I I I I I II I I I I I 

Db 943 GATGGCCCGGCAGAGCCTGTGGCCCAGGAGCGGCGTGGCAAGGCGGCCCGCATGGCCGTG 1002 

Qy 737 CTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCGCAATTTGAGG 796 

I I I I I I I III III I II I I I I II I I I I I I 

Db 1003 GTGGTGGCTGCTGCCTTTGCCATCAGCTTCCTGCCTTTTCACATCACCAAGACAGCCTAC 1062 

Qy 797 ATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAAGGCCATCAAATCTATA 856 

III I I I I I I I I I I Mill I I I 

Db 1063 CTGGCAGTGCGCTCGACGCCGGGCGTCCCCTGCACTGTATTGGAGGCCTTTGCAGCGGCC 1122 

Qy 857 TACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTC 916 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 
Db 1123 TACAAAGGCACGCGGCCGTTTGCCAGTGCCAACAGCGTGCTGGACCCCATCCTCTTCTAC 1182 

Qy 917 CTCA 920 

I I I 

Db 1183 TTCA 1186 



RESULT 8 

US-08-405-271A-18 
; Sequence 18, Application US/08405271A 
; Patent No. 6432652 
; GENERAL INFORMATION: 
; APPLICANT: EVANS, CHRISTOPHER J. 
; APPLICANT: KEITH, DUANE E. 
; TITLE OF INVENTION: OPIOID RECEPTOR GENES 
; NUMBER OF SEQUENCES: 25 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: MORRISON & FOERSTER 
; STREET: 2000 PENNSYLVANIA AVENUE, NW, Suite 5500 
; CITY: WASHINGTON 
; STATE: DC 
; COUNTRY: USA 
; ZIP: 20006-1888 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.30 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/405,271A 
; FILING DATE: 14-MAR-1995 
; CLASSIFICATION: 435 
; ATTORNEY/AGENT INFORMATION: 
; NAME: MURASHIGE, KATE H. 
; REGISTRATION NUMBER: 29,959 
; REFERENCE/DOCKET NUMBER: 22000-20526.22 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (202) 887-1500 
; TELEFAX: (202) 887-0763 
; TELEX: 90-4030 MRSNFOERSWSH 
; INFORMATION FOR SEQ ID NO: 18: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 1805 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: double 
; TOPOLOGY: linear 
; FEATURE: 
; NAME/KEY: CDS 
; LOCATION: 10..1119 
US-08-405-271A-18 

Query Match 5.3%; Score 82.2; DB 4; Length 1805; 
Best Local Similarity 44.5%; Pred. No. 3.8e-13; 
Matches 379; Conservative 0; Mismatches 463; Indels 9; Gaps 1; 

Qy 85 GGCTATCTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACT 144 

I I I I I I I I I I I I I III I I I I I I I I I 

Db 147 GCCCCTCGGGCTCAAGGTCACCATCGTGGGGCTCTACCTGGCCGTGTGTGTCGGAGGGCT 206 

Qy 145 GCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAG 204 

I I I I I I I I III I I I I I I MM II III II i I 

Db 207 CCTGGGGAACTGCCTTGTCATGTACGTCATCCTCAGGCACACCAAAATGAAGACAGCCAC 266 

Qy 205 CAATGTCTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCAT 264 

II M I I I I II I II I I I Ml I I I I I II I I I M I I I I I I 

Db 267 CAATATTTACATCTTTAACCTGGCCCTGGCCGACACTCTGGTCCTGCTGACGCTGCCCTT 326 

Qy 265 CCTGATAAAGAGTTATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAA 324 

III II I I I I I I I II II I I I I I I I 

Db 327 CCAGGGCACGGACATCCTCCTGGGCTTCTGGCCGTTTGGGAATGCGCTGTGCAAGACAGT 386 

Qy 325 CCGATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCAT 384 

I I II II II I I I I I II I M I I I II I II I I I I I I I 

Db 387 CATTGCCATTGACTACTACAACATGTTCACCAGCACCTTCACCCTAACTGCCATGAGTGT 446 

Qy 385 GGACCGATATCTGCTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATT 444 

II I I I M I I Ml II I I Mill II II II 



Db 447 GGATCGCTATGTAGCCATCTGCCACCCCATCCGTGCCCTCGACGTCCGCACGTCCAGCAA 506 



Qy 445 TGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCT 504 

III II I I I I I I M I I II I I I M I I I I I I I 

Db 507 AGCCCAGGCTGTCAATGTGGCCATCTGGGCCCTGGCCTCTGTTGTCGGTGTTCCCGTTGC 566 

Qy 505 CACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTC 564 

II I I I I I I I I I I I I I I I I I II 

Db 567 CATCATGGGCTCGGCACAGGTCGAGGATGAAGAGATCGAGTGCCTGGTGGAGATCCCTAC 626 

Qy 565 TGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAAT 624 

I II I II I lllll I III MM I I 

Db 627 CCCTCAGGATTACTGGGGCCCGGTGTTTGCCATCTGCATCTTCCTCTTCTCCTTCATCGT 686 

Qy 625 TCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAG 684 

III lllll I I I I I I I I I I I I I I I 

Db 687 CCCCGTGCTCGTCATCTCTGTCTGCTACAGCCTCATGATCCGGCGGCTCCGTGGAGTCCG 746 

Qy 685 CCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGT 744 

I I I I I I I I I I I I I I I I I I I I I 

Db 747 CCTGCTCTCGGGCTCCCGAGAGAAGGACCGGAACCTGCGGCGCATCACTCGGCTGGTGCT 806 

Qy 745 TGTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTC 804 

III I I II I II I II II II I I II I 

Db 807 GGTGGTAGTGGCTGTGTTCGTGGGCTGCTGGACGCCTGTCCAGGTCTTCGTGCTGGCCCA 866 

Qy 805 ACGCCTGGATAGTTGGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACT 864 

I I I I I I I I I I I II I I I I I I lllll 
Db 867 AGGGCTGGGGGTTCAGCCGAGCAGCGAGACTGCCGTGGCCATTCTGCGCTTCTGCAC 923 

Qy 865 GACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGG 924 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 924 GGCCCTGGGCTACGTCAACAGCTGCCTCAACCCCATCCTCTACGCCTTCCTGGA 977 

Qy 925 AGACCATTACA 935 

11 I I I I 
Db 978 TGAGAACTTCA 988 



RESULT 9 

US-09-016-434-1391 
; Sequence 1391, Application US/09016434 
; Patent No. 6500938 
; GENERAL INFORMATION: 
; APPLICANT: Janice Au-Young 
; APPLICANT: Jeffrey J. Seilhamer 
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING 
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION 
; NUMBER OF SEQUENCES: 1490 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 
; STREET: 3174 PORTER DRIVE 
; CITY: PALO ALTO 
; STATE: CALIFORNIA 
; COUNTRY: USA 
; ZIP: 94304 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/016,434 
; FILING DATE: HEREWITH 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 
; FILING DATE: 
; CLASSIFICATION: 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Zeller, Karen J. 
; REGISTRATION NUMBER: 37,071 
; REFERENCE/DOCKET NUMBER: PA-0002 US 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (650) 855-0555 
; TELEFAX: (650) 845-4166 
; INFORMATION FOR SEQ ID NO: 1391: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 1973 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; IMMEDIATE SOURCE: 
; LIBRARY: GENBANK 
; CLONE: g471316 
US-09-016-434-1391 

Query Match 5.3%; Score 82.2; DB 4; Length 1973; 
Best Local Similarity 44.5%; Pred. No. 4e-13; 
Matches 379; Conservative 0; Mismatches 463; Indels 9; Gaps 1; 

Qy 85 GGCTATCTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACT 144 

I I I I I I I I I I I I I III I I I I I MM 

Db 315 GCCCCTCGGGCTCAAGGTCACCATCGTGGGGCTCTACCTGGCCGTGTGTGTCGGAGGGCT 374 

Qy 145 GCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAG 204 

I I I II M I I II I I I I I I I I II M III II II 

Db 375 CCTGGGGAACTGCCTTGTCATGTACGTCATCCTCAGGCACACCAAAATGAAGACAGCCAC 434 

Qy 205 CAATGTCTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCAT 264 

II II I M I II I I I II I II I I I I I I I I I I I I I II I I I I 

Db 435 CAATATTTACATCTTTAACCTGGCCCTGGCCGACACTCTGGTCCTGCTGACGCTGCCCTT 494 

Qy 265 CCTGATAAAGAGTTATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAA 324 

II I II I I I I I I II I I II I I I I I I 

Db 495 CCAGGGCACGGACATCCTCCTGGGCTTCTGGCCGTTTGGGAATGCGCTGTGCAAGACAGT 554 

Qy 325 CCGATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCAT 384 

I I II I I I I I I I I I I II I I I I I I II I II M I II I 

Db 555 CATTGCCATTGACTACTACAACATGTTCACCAGCACCTTCACCCTAACTGCCATGAGTGT 614 

Qy 385 GGACCGATATCTGCTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATT 444 

I I I I I I I I I III I I I I I I I I I II M M 

Db 615 GGATCGCTATGTAGCCATCTGCCACCCCATCCGTGCCCTCGACGTCCGCACGTCCAGCAA 674 



Qy 445 TGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCT 504 

III II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 675 AGCCCAGGCTGTCAATGTGGCCATCTGGGCCCTGGCCTCTGTTGTCGGTGTTCCCGTTGC 734 

Qy 505 CACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTC 564 

II I I I I I I I I I I I I I I I I I II 

Db 735 CATCATGGGCTCGGCACAGGTCGAGGATGAAGAGATCGAGTGCCTGGTGGAGATCCCTAC 794 

Qy 565 TGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAAT 624 

I II I II I II I I I I III I I I I I I 

Db 795 CCCTCAGGATTACTGGGGCCCGGTGTTTGCCATCTGCATCTTCCTCTTCTCCTTCATCGT 854 

Qy 625 TCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAG 684 

III I I I I I I I I I I I I I I I I I I I I 

Db 855 CCCCGTGCTCGTCATCTCTGTCTGCTACAGCCTCATGATCCGGCGGCTCCGTGGAGTCCG 914 

Qy 685 CCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGT 744 

I I I I I I I I I I I I I I I I I I I I I 

Db 915 CCTGCTCTCGGGCTCCCGAGAGAAGGACCGGAACCTGCGGCGCATCACTCGGCTGGTGCT 974 

Qy 745 TGTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTC 804 

I I I I I I I I I I I II II II I I I I I 

Db 975 GGTGGTAGTGGCTGTGTTCGTGGGCTGCTGGACGCCTGTCCAGGTCTTCGTGCTGGCCCA 1034 

Qy 805 ACGCCTGGATAGTTGGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACT 864

I I I I I I I I I I I II I I I I I I I I I I I 
Db 1035 AGGGCTGGGGGTTCAGCCGAGCAGCGAGACTGCCGTGGCCATTCTGCGCTTCTGCAC 1091 

Qy 865 GACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGG 924

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1092 GGCCCTGGGCTACGTCAACAGCTGCCTCAACCCCATCCTCTACGCCTTCCTGGA 1145 

Qy 925 AGACCATTACA 935 

II I I I I 

Db 1146 TGAGAACTTCA 1156



RESULT 10 

US-09-023-655-1417
; Sequence 1417, Application US/09023655
; Patent No. 6607879
; GENERAL INFORMATION:
; APPLICANT: Cocks, Benjamin G.
; APPLICANT: Susan G. Stuart
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF BLOOD CELL GENE
; TITLE OF INVENTION: EXPRESSION
; NUMBER OF SEQUENCES: 1508
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/023,655
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0001 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 1417:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1973 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g471316
US-09-023-655-1417

Query Match 5.3%; Score 82.2; DB 4; Length 1973; 

Best Local Similarity 44.5%; Pred. No. 4e-13; 

Matches 379; Conservative 0; Mismatches 463; Indels 9; Gaps 1; 

Qy 85 GGCTATCTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACT 144

I I I I I I M I I I I I I I I I I I I I I I I I 

Db 315 GCCCCTCGGGCTCAAGGTCACCATCGTGGGGCTCTACCTGGCCGTGTGTGTCGGAGGGCT 374 

Qy 145 GCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAG 204 

I I I I I I I I III I I I I I I I I I I II III II I I 

Db 375 CCTGGGGAACTGCCTTGTCATGTACGTCATCCTCAGGCACACCAAAATGAAGACAGCCAC 434

Qy 205 CAATGTCTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCAT 264 

I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I II I 

Db 435 CAATATTTACATCTTTAACCTGGCCCTGGCCGACACTCTGGTCCTGCTGACGCTGCCCTT 494 

Qy 265 CCTGATAAAGAGTTATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAA 324

III II I I I I I I I I I I I I I I I I I I 

Db 495 CCAGGGCACGGACATCCTCCTGGGCTTCTGGCCGTTTGGGAATGCGCTGTGCAAGACAGT 554 

Qy 325 CCGATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCAT 384

I I II I I I I I I i I I I I I I I I I I I II I I I I I I I I I 

Db 555 CATTGCCATTGACTACTACAACATGTTCACCAGCACCTTCACCCTAACTGCCATGAGTGT 614

Qy 385 GGACCGATATCTGCTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATT 444

I I I I I I I I I III MM I I I I I I I II II 

Db 615 GGATCGCTATGTAGCCATCTGCCACCCCATCCGTGCCCTCGACGTCCGCACGTCCAGCAA 674 



Qy 445 TGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCT 504

Ml II I I I I I I I I I II I I I I II I I I III I 

Db 675 AGCCCAGGCTGTCAATGTGGCCATCTGGGCCCTGGCCTCTGTTGTCGGTGTTCCCGTTGC 734 

Qy 505 CACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTC 564 

II I I I I I I I I I I I I I I I I I II 

Db 735 CATCATGGGCTCGGCACAGGTCGAGGATGAAGAGATCGAGTGCCTGGTGGAGATCCCTAC 794

Qy 565 TGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAAT 624 

I II I II I I I I I I I III I I! I I I 

Db 795 CCCTCAGGATTAGTGGGGCCCGGTGTTTGCCATCTGCATCTTCCTCTTCTCCTTCATCGT 854 

Qy 625 TCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAG 684 

III I I I I I I I 1 I I I I I I I I I I I I 

Db 855 CCCCGTGCTCGTCATCTCTGTCTGCTACAGCCTCATGATCCGGCGGCTCCGTGGAGTCCG 914 

Qy 685 CCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGT 744

I I I I I I I I I I I I I I I I I I I I I 

Db 915 CCTGCTCTCGGGCTCCCGAGAGAAGGACCGGAACCTGCGGCGCATCACTCGGCTGGTGCT 974 

Qy 745 TGTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTC 804

I I I I I I I I I I I II II II I I I I I 

Db 975 GGTGGTAGTGGCTGTGTTCGTGGGCTGCTGGACGCCTGTCCAGGTCTTCGTGCTGGCCCA 1034 

Qy 805 ACGCCTGGATAGTTGGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACT 864

I I I I I I I I I I I II I I I I I I I I I I I 
Db 1035 AGGGCTGGGGGTTCAGCCGAGCAGCGAGACTGCCGTGGCCATTCTGCGCTTCTGCAC 1091 

Qy 865 GACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGG 924

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1092 GGCCCTGGGCTACGTCAACAGCTGCCTCAACCCCATCCTCTACGCCTTCCTGGA 1145 

Qy 925 AGACCATTACA 935 

II I I I I 

Db 1146 TGAGAACTTCA 1156



RESULT 11 
US-09-976-594-171
; Sequence 171, Application US/09976594
; Patent No. 6673549
; GENERAL INFORMATION:
; APPLICANT: Furness, Michael
; APPLICANT: Buchbinder, Jenny
; TITLE OF INVENTION: GENES EXPRESSED IN C3A LIVER CELL CULTURES TREATED WITH STEROIDS
; FILE REFERENCE: PA-0041 US
; CURRENT APPLICATION NUMBER: US/09/976,594
; CURRENT FILING DATE: 2001-10-12
; PRIOR APPLICATION NUMBER: 60/240,409
; PRIOR FILING DATE: 2000-10-12
; NUMBER OF SEQ ID NOS: 1143
; SOFTWARE: PERL Program
; SEQ ID NO 171
; LENGTH: 3205
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: Incyte ID No. 6673549 222181.1
US-09-976-594-171

Query Match 5.3%; Score 82.2; DB 4; Length 3205; 

Best Local Similarity 44.5%; Pred. No. 5.3e-13; 

Matches 379; Conservative 0; Mismatches 463; Indels 9; Gaps 1; 

Qy 85 GGCTATCTTGAATAAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACT 144

I I I I I I I I I I I I I I II II I I I I I I I 

Db 389 GCCCCTCGGGCTCAAGGTCACCATCGTGGGGCTCTACCTGGCCGTGTGTGTCGGAGGGCT 448

Qy 145 GCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAG 204

I I I I I I I I III II I II I I I I I II III II II 

Db 449 CCTGGGGAACTGCCTTGTCATGTACGTCATCCTCAGGCACACCAAAATGAAGACAGCCAC 508

Qy 205 CAATGTCTATCTTTTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCAT 264

I I I I I I I l.hlhl III I II I I I I I I I I I I I I I I I 

Db 509 CAATATTTACATCTTTAACCTGGCCCTGGCCGACACTCTGGTCCTGCTGACGCTGCCCTT 568 

Qy 265 CCTGATAAAGAGTTATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAA 324

III II I I I I I I I I I I I I I I I I I I 

Db 569 CCAGGGCACGGACATCCTCCTGGGCTTCTGGCCGTTTGGGAATGCGCTGTGCAAGACAGT 628 

Qy 325 CCGATATGTGCTTCACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCAT 384

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 629 CATTGCCATTGACTACTACAACATGTTCACCAGCACCTTCACCCTAACTGCCATGAGTGT 688

Qy 385 GGACCGATATCTGCTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATT 444

I I I I I I I I I III I I I I I I I I I II II II 

Db 689 GGATCGCTATGTAGCCATCTGCCACCCCATCCGTGCCCTCGACGTCCGCACGTCCAGCAA 748

Qy 445 TGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCT 504

III II I I I I I I I I I I I I I I I II I I I I I I I 

Db 749 AGCCCAGGCTGTCAATGTGGCCATCTGGGCCCTGGCCTCTGTTGTCGGTGTTCCCGTTGC 808

Qy 505 CACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTC 564

II I I I I I I I I I I I I I I I I I II 

Db 809 CATCATGGGCTCGGCACAGGTCGAGGATGAAGAGATCGAGTGCCTGGTGGAGATCCCTAC 868

Qy 565 TGGAAACCCTGAACACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAAT 624 

I II I II I I I I I I I I II I I I I I I 

Db 869 CCCTCAGGATTACTGGGGCCCGGTGTTTGCCATCTGCATCTTCCTCTTCTCCTTCATCGT 928

Qy 625 TCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAG 684

III I I I I I I I I I I I I I I I I I I I I 

Db 929 CCCCGTGCTCGTCATCTCTGTCTGCTACAGCCTCATGATCCGGCGGCTCCGTGGAGTCCG 988

Qy 685 CCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGT 744 

I I I I I I I I I I I I II I I I II I I 

Db 989 CCTGCTCTCGGGCTCCCGAGAGAAGGACCGGAACCTGCGGCGCATCACTCGGCTGGTGCT 1048 

Qy 745 TGTGATCTTCTCTATACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTC 804

M I I I Ml III II II II I I I I I 

Db 1049 GGTGGTAGTGGCTGTGTTCGTGGGCTGCTGGACGCCTGTCCAGGTCTTCGTGCTGGCCCA 1108



Qy 805 ACGCCTGGATAGTTGGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACT 864

I I I I I I I I I I I II I I I I I I I I I I I
Db 1109 AGGGCTGGGGGTTCAGCCGAGCAGCGAGACTGCCGTGGCCATTCTGCGCTTCTGCAC 1165

Qy 865 GACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGG 924

I I I I II I I I I I I I I I I I I I I II I II I I I I I I I I I I I
Db 1166 GGCCCTGGGCTACGTCAACAGCTGCCTCAACCCCATCCTCTACGCCTTCCTGGA 1219

Qy 925 AGACCATTACA 935

II I I I I
Db 1220 TGAGAACTTCA 1230



RESULT 12 
US-09-023-655-992
; Sequence 992, Application US/09023655
; Patent No. 6607879
; GENERAL INFORMATION:
; APPLICANT: Cocks, Benjamin G.
; APPLICANT: Susan G. Stuart
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF BLOOD CELL GENE
; TITLE OF INVENTION: EXPRESSION
; NUMBER OF SEQUENCES: 1508
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/023,655
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0001 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 992:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1158 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g1668735
US-09-023-655-992

Query Match 5.2%; Score 80; DB 4; Length 1158; 

Best Local Similarity 47.3%; Pred. No. 1.2e-12; 

Matches 276; Conservative 0; Mismatches 305; Indels 3; Gaps 1; 

Qy 98 AAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTC 157 

I I I I III III I I I I I I II II I I I I I I I I I I I I I I 

Db 171 AAGTTGCTCCTTGCTGTCTTTTATTGCCTCCTGTTTGTATTCAGTCTTCTGGGAAACAGC 230 

Qy 158 ACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTT 217

II I III I I I I I I I I I I I I I I I I I I tilt It II 

Db 231 CTGGTCATCCTGGTCCTTGTGGTCTGCAAGAAGCTGAGGAGCATCACAGATGTATACCTC 290

Qy 218 TTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGT 277 

I I I I I I I I I I I I I II I I III I I Ml I III I I II I 
Db 291 TTGAACCTGGCCCTGTCTGACCTGCTTTTTGTCTTCTCCTTCCCCTTTCAGACCTA---C 347

Qy 278 TATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTT 337

III I I I I I I I III I II I II I I I I 

Db 348 TATCTGCTGGACCAGTGGGTGTTTGGGACTGTAATGTGCAAAGTGGTGTCTGGCTTTTAT 407

Qy 338 CACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTG 397

III I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I 

Db 408 TACATTGGCTTCTACAGCAGCATGTTTTTCATCACCCTCATGAGTGTGGACAGGTACCTG 467

Qy 398 CTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATC 457

I III II I III I I I I I 

Db 468 GCTGTTGTCCATGCCGTGTATGCCCTAAAGGTGAGGACGATCAGGATGGGCACAACGCTG 527

Qy 458 TCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAAT 517 

I I I I I I I I I I I II II I I I I I II II II 

Db 528 TGCCTGGCAGTATGGCTAACCGCCATTATGGCTACCATCCCATTGCTAGTGTTTTACCAA 587

Qy 518 TCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAA 577 

I I I I I I I I I III I I I I I I 

Db 588 GTGGCCTCTGAAGATGGTGTTCTACAGTGTTATTCATTTTACAATCAACAGACTTTGAAG 647

Qy 578 CACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTG 637

II III I I I I I III I I I I I I I I I I I I I III I 

Db 648 TGGAAGATCTTCACCAACTTCAAAATGAACATTTTAGGCTTGTTGATCCCATTCACCATC 707

Qy 638 ATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAG 681

I I I I I I I I I I I II I I I I I I I I I I 

Db 708 TTTATGTTCTGCTACATTAAAATCCTGCACCAGCTGAAGAGGTG 751 



RESULT 13 
US-08-461-244-1
; Sequence 1, Application US/08461244
; Patent No. 5776729
; GENERAL INFORMATION:
; APPLICANT: Soppet, Daniel R.
; APPLICANT: Yi, Li
; APPLICANT: Ruben, Steven M.
; APPLICANT: Rosen, Craig A.
; TITLE OF INVENTION: HUMAN G-PROTEIN RECEPTOR HGBER32
; NUMBER OF SEQUENCES: 7
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: CARELLA, BYRNE, BAIN, GILFILLAN, CECCHI,
; ADDRESSEE: STUART & OLSTEIN
; STREET: 6 Becker Farm Road
; CITY: Roseland
; STATE: New Jersey
; COUNTRY: USA
; ZIP: 07068
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/461,244
; FILING DATE: 05-JUN-1995
; CLASSIFICATION: 536
; ATTORNEY/AGENT INFORMATION:
; NAME: Ferraro, Gregory D.
; REGISTRATION NUMBER: 36,134
; REFERENCE/DOCKET NUMBER: 325800-445
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 201-994-1700
; TELEFAX: 201-994-1744
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1586 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 431..1495
US-08-461-244-1

Query Match 5.2%; Score 80; DB 1; Length 1586; 

Best Local Similarity 47.3%; Pred. No. 1.5e-12; 

Matches 276; Conservative 0; Mismatches 305; Indels 3; Gaps 1; 

Qy 98 AAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTC 157

I I I I III III I I I I I I II III III II II M I I 

Db 533 AAGTTGCTCCTTGCTGTCTTTTATTGCCTCCTGTTTGTATTCAGTCTTCTGGGAAACAGC 592 

Qy 158 ACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTT 217 

II I III I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 593 CTGGTCATCCTGGTCCTTGTGGTCTGCAAGAAGCTGAGGAGCATCACAGATGTATACCTC 652 



Qy 218 TTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGT 277

I I I II II I I I I I I I I I I III I I I I I I I I I I I I I I
Db 653 TTGAACCTGGCCCTGTCTGACCTGCTTTTTGTCTTCTCCTTCCCCTTTCAGACCTA---C 709



Qy 278 TATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTT 337

III I I I I I I I I I I III I II I I I I 

Db 710 TATCTGCTGGACCAGTGGGTGTTTGGGACTGTAATGTGCAAAGTGGTGTCTGGCTTTTAT 769 

Qy 338 CACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTG 397

III I I I I I I I I I I I I I I I II I I I I I I I I I I I I I M I I I I I I 

Db 770 TACATTGGCTTCTACAGCAGCATGTTTTTCATCACCCTCATGAGTGTGGACAGGTACCTG 829

Qy 398 CTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATC 457

I III II I III I I I I I 

Db 830 GCTGTTGTCCATGCCGTGTATGCCCTAAAGGTGAGGACGATCAGGATGGGCACAACGCTG 889

Qy 458 TCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAAT 517 

I I I I I I I I I I I II II I I I I I I I II II 

Db 890 TGCCTGGCAGTATGGCTAACCGCCATTATGGCTACCATCCCATTGCTAGTGTTTTACCAA 949

Qy 518 TCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAA 577

I I I I I I I I I III I I I I I I 

Db 950 GTGGCCTCTGAAGATGGTGTTCTACAGTGTTATTCATTTTACAATCAACAGACTTTGAAG 1009

Qy 578 CACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTG 637

II III I I I I I III I I I I I I I I I I I I I III I 

Db 1010 TGGAAGATCTTCACCAACTTCAAAATGAACATTTTAGGCTTGTTGATCCCATTCACCATC 1069

Qy 638 ATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAG 681

I I I I I I I I I I I I I I I I I I I I I I I 

Db 1070 TTTATGTTCTGCTACATTAAAATCCTGCACCAGCTGAAGAGGTG 1113



RESULT 14 

US-09-016-434-1096
; Sequence 1096, Application US/09016434
; Patent No. 6500938
; GENERAL INFORMATION:
; APPLICANT: Janice Au-Young
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF SIGNALING
; TITLE OF INVENTION: PATHWAY GENE EXPRESSION
; NUMBER OF SEQUENCES: 1490
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/016,434
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0002 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 1096:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1953 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g1245056
US-09-016-434-1096

Query Match 5.2%; Score 80; DB 4; Length 1953;

Best Local Similarity 47.3%; Pred. No. 1.7e-12; 

Matches 276; Conservative 0; Mismatches 305; Indels 3; Gaps 1; 

Qy 98 AAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTC 157 

I I I I III III I I I I I I II III III I I I I I I I I I I 

Db 369 AAGTTGCTCCTTGCTGTCTTTTATTGCCTCCTGTTTGTATTCAGTCTTCTGGGAAACAGC 428

Qy 158 ACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTT 217 

II I III I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 429 CTGGTCATCCTGGTCCTTGTGGTCTGCAAGAAGCTGAGGAGCATCACAGATGTATACCTC 488

Qy 218 TTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGT 277 

I I I I I I I I I I I I I I I I I III I I I II I III I I II I 
Db 489 TTGAACCTGGCCCTGTCTGACCTGCTTTTTGTCTTCTCCTTCCCCTTTCAGACCTA---C 545

Qy 278 TATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTT 337

III II I I I I I I I I III I II I I I I 

Db 546 TATCTGCTGGACCAGTGGGTGTTTGGGACTGTAATGTGCAAAGTGGTGTCTGGCTTTTAT 605 

Qy 338 CACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTG 397

III I I I I I I I I I I I II I III I I I I II I I II I I I I I I II III 

Db 606 TACATTGGCTTCTACAGCAGCATGTTTTTCATCACCCTCATGAGTGTGGACAGGTACCTG 665

Qy 398 CTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATC 457

I III II I III I I I I I 

Db 666 GCTGTTGTCCATGCCGTGTATGCCCTAAAGGTGAGGACGATCAGGATGGGCACAACGCTG 725 

Qy 458 TCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAAT 517 

I I I I I I I I I I I II II I I I I I I I II II 

Db 726 TGCCTGGCAGTATGGCTAACCGCCATTATGGCTACCATCCCATTGCTAGTGTTTTACCAA 785

Qy 518 TCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAA 577

I I I II I I I I III I I I I I I 

Db 786 GTGGCCTCTGAAGATGGTGTTCTACAGTGTTATTCATTTTACAATCAACAGACTTTGAAG 845



Qy 578 CACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTG 637

Db 846 TGGAAGATCTTCACCAACTTCAAAATGAACATTTTAGGCTTGTTGATCCCATTCACCATC 905

Qy 638 ATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAG 681

I I I II I I I I I I I I I I I I I I I I I I 

Db 906 TTTATGTTCTGCTACATTAAAATCCTGCACCAGCTGAAGAGGTG 949



RESULT 15 
US-09-023-655-955
; Sequence 955, Application US/09023655
; Patent No. 6607879
; GENERAL INFORMATION:
; APPLICANT: Cocks, Benjamin G.
; APPLICANT: Susan G. Stuart
; APPLICANT: Jeffrey J. Seilhamer
; TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF BLOOD CELL GENE
; TITLE OF INVENTION: EXPRESSION
; NUMBER OF SEQUENCES: 1508
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
; STREET: 3174 PORTER DRIVE
; CITY: PALO ALTO
; STATE: CALIFORNIA
; COUNTRY: USA
; ZIP: 94304
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/023,655
; FILING DATE: HEREWITH
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: Zeller, Karen J.
; REGISTRATION NUMBER: 37,071
; REFERENCE/DOCKET NUMBER: PA-0001 US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (650) 855-0555
; TELEFAX: (650) 845-4166
; INFORMATION FOR SEQ ID NO: 955:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2608 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; LIBRARY: GENBANK
; CLONE: g1468978
US-09-023-655-955



Query Match 5.2%; Score 80; DB 4; Length 2608; 

Best Local Similarity 47.3%; Pred. No. 2e-12;

Matches 276; Conservative 0; Mismatches 305; Indels 3; Gaps 1;



Qy 98 AAGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTC 157 

I I II III III I I I I II II IN IN I II II II II I 

Db 463 AAGTTGCTCCTTGCTGTCTTTTATTGCCTCCTGTTTGTATTCAGTCTTCTGGGAAACAGC 522

Qy 158 ACTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTT 217 

II I III I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 523 CTGGTCATCCTGGTCCTTGTGGTCTGCAAGAAGCTGAGGAGCATCACAGATGTATACCTC 582

Qy 218 TTTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGT 277 

I I I I I I I I.IMM III I I I I I I I I I I I I I I 
Db 583 TTGAACCTGGCCCTGTCTGACCTGCTTTTTGTCTTCTCCTTCCCCTTTCAGACCTA---C 639

Qy 278 TATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTT 337

III I I I I I I I I I I I I I I I I I I I I 

Db 640 TATCTGCTGGACCAGTGGGTGTTTGGGACTGTAATGTGCAAAGTGGTGTCTGGCTTTTAT 699

Qy 338 CACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTG 397

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 700 TACATTGGCTTCTACAGCAGCATGTTTTTCATCACCCTCATGAGTGTGGACAGGTACCTG 759

Qy 398 CTCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATC 457

I III II I III I I I I I 

Db 760 GCTGTTGTCCATGCCGTGTATGCCCTAAAGGTGAGGACGATCAGGATGGGCACAACGCTG 819

Qy 458 TCGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAAT 517 

I I I I I I I I I I I II II I I I I I I I II II 

Db 820 TGCCTGGCAGTATGGCTAACCGCCATTATGGCTACCATCCCATTGCTAGTGTTTTACCAA 879

Qy 518 TCTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAA 577

III I I I I I I III I I I I I I 

Db 880 GTGGCCTCTGAAGATGGTGTTCTACAGTGTTATTCATTTTACAATCAACAGACTTTGAAG 939

Qy 578 CACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTG 637 

II III I I I I I III I I I I I I I I I I I I I III I 

Db 940 TGGAAGATCTTCACCAACTTCAAAATGAACATTTTAGGCTTGTTGATCCCATTCACCATC 999

Qy 638 ATGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAG 681

I I I I I I I I I I I I I I I I I I I I I I I 

Db 1000 TTTATGTTCTGCTACATTAAAATCCTGCACCAGCTGAAGAGGTG 1043 



Search completed: August 24, 2004, 16:05:17 
Job time : 128 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on: August 24, 2004, 14:51:16 ; Search time 749 Seconds
(without alignments)
10119.388 Million cell updates/sec

Title: US-09-891-138A-1
Perfect score: 1543
Sequence: 1 gctcctggcagagttttctg tgcctaaataaatcaatata 1543

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 3228839 seqs, 2456066551 residues

Total number of hits satisfying chosen parameters: 6457678

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : Published_Applications_NA:*

1: /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
2: /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
3: /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
4: /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
5: /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
6: /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
7: /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
8: /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
9: /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
10: /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
11: /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
12: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
13: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq2:*
14: /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
15: /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
16: /cgn2_6/ptodata/2/pubpna/US10C_PUBCOMB.seq:*
17: /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
18: /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
19: /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
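
As a reading aid for the per-result summary lines, the sketch below shows how the two percentage fields appear to relate to the other reported figures. This is an assumption inferred from the numbers printed in this report, not a documented GenCore definition: Query Match is taken as the score divided by the perfect score, and Best Local Similarity as matches divided by all aligned columns (matches, mismatches and indels). The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-match) score.

    Assumed relationship, inferred from the figures printed in this report.
    """
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns.

    Assumed relationship: columns = matches + mismatches + indels.
    """
    return 100.0 * matches / (matches + mismatches + indels)


if __name__ == "__main__":
    # Figures copied from result summaries in this report (perfect score 1543).
    print(round(query_match_pct(82.2, 1543), 1))
    print(round(best_local_similarity_pct(379, 463, 9), 1))
    print(round(best_local_similarity_pct(276, 305, 3), 1))
```

Rounded to one decimal place, these reproduce the 5.3%, 44.5% and 47.3% values printed for the corresponding results, which is the only evidence offered here for the assumed formulas.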
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RESULT 1 

US-09-891-138A-1
; Sequence 1, Application US/09891138A
; Publication No. US20030083245A1
; GENERAL INFORMATION:
; APPLICANT: Lin, Daniel Chi-Hong
; APPLICANT: Zhao, Jiagang
; APPLICANT: Chen, Jin-Long
; APPLICANT: Cutler, Gene
; APPLICANT: Tularik Inc.
; TITLE OF INVENTION: No. US20030083245Alel Receptors
; FILE REFERENCE: 018781-0062 10US
; CURRENT APPLICATION NUMBER: US/09/891,138A
; CURRENT FILING DATE: 2001-06-25
; PRIOR APPLICATION NUMBER: US 60/213,461
; PRIOR FILING DATE: 2000-06-23
; NUMBER OF SEQ ID NOS: 26
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 1
; LENGTH: 1543
; TYPE: DNA
; ORGANISM: Mus musculus
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (44)..(997)
; OTHER INFORMATION: mouse TGR18 G-protein coupled receptor (GPCR)
US-09-891-138A-1

Query Match 100.0%; Score 1543; DB 10; Length 1543; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1543; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60 

I I I II I I I I I I I I I I I I 1 I I I I I I I I I I ! I I I I I I I I I I I I I II I I I I I I M I I I I I I I I 
Db 1 GCTCCTGGCAGAGTTTTCTGTCGAGACAGAAGCCGACAGCAGAATGGCACAGAATTTATC 60

Qy 61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I M I I I I I I I I I 
Db 61 TTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATAAGTACTACCTCTCTGCATTTTA 120 

Qy 121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180 

I I I I I I I I I M I I I M I I I I I I I I I M I I I I I I I I I I I I I I II I I II I M I I I I I I I I I I 
Db 121 TGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCACTGTGGTGTTCGGCTACCTCTT 180 

Qy 181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240

I I I II II I I I I II I I I I I I I I I I I I I I M I I II I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 181 CTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTTTTAACCTTTCCATCTCTGACTT 240

Qy 241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I II I I M I I I I I I M II I I 
Db 241 TGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTTATGCCAATGATAAGGGGACCTA 300

Qy 301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360

I I I I I II I I II I II I I I II I I I I I I I II I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 TGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTCACACCAACCTCTACACCAGCAT 360

Qy 361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I II I I I I I I I I I I I I M I I I I I I I II I 
Db 361 CCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGCTCATGAAGTACCCTTTCCGAGA 420



Qy 421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480

I I I I I I I I I I I I II I I I I II I I I I II I II I II I I I I I I I I I I I I I I I I I M I I I I I I I I I
Db 421 ACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCTCGCTGGCTGTCTGGGCCTTAGT 480



Qy 481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 540

I I I I I II I I I I I I I I I I I I I I II I I I I I 1 II I I I I I I I I I I I I I I M i I I I I I I I I I I M 
Db 481 GACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATTCTGTCCCAAAAGAAGAGGGCAG 540

Qy 541 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 600

I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
Db 541 TAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAACACAATCTCATTTACAGCCTCTG 600

Qy 601 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 660 

I I I I I I I I I I I I I I I I I 1 II I I I I I I I II II I I I I I I I I I I I I I I I I M I I I I I I I I I I I 
Db 601 CCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGATGTGCTTCTTCTACTACAAGAT 660 

Qy 661 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 720

I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I II I I II II I I I 
Db 661 GGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAACTGCCCTGCCACTGGACAAACC 720

Qy 721 CCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 780 

I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 721 CCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTATACTCTTCACACCCTATCATAT 780

Qy 781 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 840

I I I I I I I I I I I I I I I I I II II I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I M I I I 
Db 781 CATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTTGGCCACAAGGATGTACACAGAA 840

Qy 841 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 900 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 841 GGCCATCAAATCTATATACACACTGACACGGCCTCTGGCCTTTCTGAACAGTGCCATCAA 900 

Qy 901 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 960

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I M M I I I I 
Db 901 TCCCATCTTCTACTTCCTCATGGGAGACCATTACAGAGAGATGCTGATTAGTAAGTTCAG 960

Qy 961 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1020

I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I 
Db 961 ACAATACTTCAAGTCCCTTACATCCTTCAGGACATGAGCTGCTGGATGCAGGTCTTCACT 1020

Qy 1021 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1080

I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1021 CAGCCAAAATGAGACACTTGATAAACAGTGCTGTGCAGTTGAGTTTTAACTAAGTAAACC 1080

Qy 1081 ACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 1140

I I I I I I I I I I I II I I I I II I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 ACCATTTCTAGGCTTTAGCTTTCCACCATCCTCCAACCCCCAGGGCTGGAGTACAAGCTG 114 0 

Qy 1141 GGT C CAC AT GAAT CAGAAG GCAGCT CT CT GT T CT GATT T TAG GT TAT AC C C AGAGT AT G G 12 00 

I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 GGTCCACATGAATCAGAAGGCAGCTCTCTGTTCTGATTTTAGGTTATACCCAGAGTATGG 1200 

Qy 1201 AAAAAAT AAG GC AT GAGAAAG C ATT GAC AT CTT CACT T AAG AAC T GAAC AAAAGAGAAC A 12 60 

I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I II I I 

Db 1201 AAAAAAT AAGGCATGAGAAAGCATTGACATCTTCACTTAAGAACTGAACAAAAGAGAACA 12 60 

Qy 1261 AAT AT T GT CAAT GT T T GGAC AC T T AGGAT CT GAAAT CT T G GAAAT TT T AAGACC T CT T T T 1320 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1261 AAT AT T GT C AAT GT T T GGAC AC T T AGGAT CT GAAAT CTT GGAAAT TT T AAGACC T CT T T T 1320 



Qy 1321 T CTATCAGT GTAAAAGGAATACAAGATAGCTAGTTGCAAAT GCTGAAT GCATTTCATCAT 13 8 0 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1321 T CT AT CAGT GT AAAAG GAAT AC AAGAT AGCT AGT T GCAAAT G C T GAAT G C AT TT CAT CAT 138 0 

Qy 13 81 T GGT C AG GT C GAT AAG C GT GT T T CT GAAAT AGT C TT AT T T T TAT T C T T GT AAT AT TAAAA 144 0 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 13 81 TGGTCAGGTCGATAAGCGTGTTTCTGAAATAGTCTTATTTTTATTCTTGTAATATTAAAA 1440 

Qy 1441 T T TAT GT GAAAAAT GAAT AT AAT T C AAT GT ACAACAT T AGAT T T T CT AT T T GAAAAT TAT 1500 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I M I II I I I I I I I I I I 
Db 1441 T T TAT GT GAAAAAT GAAT AT AAT T CAAT GT ACAACAT T AGAT T T T CT AT TT GAAAAT TAT 1500 

Qy 1501 AT T T CT T GAAAAAAT AAC T GC T GT G CC T AAAT AAAT CAAT AT A 1543 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I 
Db 1501 AT T T CT T GAAAAAAT AAC T GC T GT GCCT AAAT AAAT CAAT AT A 1543 



RESULT 2 

US-09-875-076-35 

; Sequence 35, Application US/09875076 

; Publication No. US20030017528A1 

; GENERAL INFORMATION: 

; APPLICANT: Chen, Ruoping 

; APPLICANT: Dang, Huong T. 

; APPLICANT: Liaw, Chen W. 

; APPLICANT: Lin, I-Lin 

; TITLE OF INVENTION: Human Orphan G Protein Coupled Receptors 

; FILE REFERENCE: AREN0050
; CURRENT APPLICATION NUMBER: US/09/875,076
; CURRENT FILING DATE: 2001-06-06
; PRIOR APPLICATION NUMBER: 09/417,044
; PRIOR FILING DATE: 1999-10-12
; PRIOR APPLICATION NUMBER: 60/120,416
; PRIOR FILING DATE: 1999-02-16
; PRIOR APPLICATION NUMBER: 60/121,851
; PRIOR FILING DATE: 1999-02-26
; PRIOR APPLICATION NUMBER: 60/123,946
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,949
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/136,436
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,437
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,439
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,567
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/137,127
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/137,131
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/141,448
; PRIOR FILING DATE: 1999-06-29
; PRIOR APPLICATION NUMBER: 60/156,653
; PRIOR FILING DATE: 1999-09-29
; PRIOR APPLICATION NUMBER: 60/156,633
; PRIOR FILING DATE: 1999-09-29
; PRIOR APPLICATION NUMBER: 60/156,555
; PRIOR FILING DATE: 1999-09-29
; PRIOR APPLICATION NUMBER: 60/156,634
; PRIOR FILING DATE: 1999-09-29
; PRIOR APPLICATION NUMBER: 60/157,280
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/157,294
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/157,281
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/157,293
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/157,282
; PRIOR FILING DATE: 1999-10-01
; NUMBER OF SEQ ID NOS: 74
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 35
; LENGTH: 1005
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-875-076-35 

Query Match 38.4%; Score 592.4; DB 13; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 2.7e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1; 
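
The percentages in this summary can be checked against the raw counts. The sketch below is an assumption about how the figures relate, not the search program's documented definition: it takes "Query Match" as the score divided by the perfect score (1543, from the run header) and "Best Local Similarity" as matches divided by the number of alignment columns (matches + mismatches + indels). Both function names are hypothetical. Under those assumptions the reported values are reproduced.

```python
# Assumed formulas; they happen to reproduce the figures printed in the report.
PERFECT_SCORE = 1543  # "Perfect score" from the run header


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    print(round(query_match_pct(592.4), 1))                   # 38.4
    print(round(best_local_similarity_pct(750, 241, 3), 1))   # 75.5
```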

Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
          | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db      8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
          |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db     68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
           ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db    128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
          ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db    188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
          |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db    248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
          |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  || 
Db    308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
          | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db    368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
          |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || ||| 
Db    428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
          ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |  
Db    488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
          |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db    548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
          |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db    608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
          |||| ||||| || || || || |    | ||||  || |||| || || ||||||||| 
Db    668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
          | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db    728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787

Qy    819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
          |   ||   |    || || |||   | |||||| ||  | ||||   ||||||||||| 
Db    788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
          |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db    848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
          | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db    908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029
          | |||  || |  |     |||| ||||  ||||
Db    968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001
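
The marker row between each Qy and Db line can be regenerated from the two sequence rows themselves. The sketch below assumes the usual convention for this kind of report: a bar under every column where the aligned bases are identical, a blank elsewhere, and no bar for a gap character ("-"). The function name `match_line` is hypothetical and is not part of the search program. The example uses the final block of the alignment above (Qy 996-1029 against Db 968-1001).

```python
def match_line(qy: str, db: str, gap: str = "-") -> str:
    """Return a '|' under each column where the aligned residues are identical."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have the same number of columns")
    return "".join(
        "|" if q == d and q != gap else " "
        for q, d in zip(qy.upper(), db.upper())
    )


if __name__ == "__main__":
    qy = "GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA"  # Qy 996-1029
    db = "GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA"  # Db 968-1001
    bars = match_line(qy, db)
    print(qy)
    print(bars)
    print(db)
    print(bars.count("|"))  # identical columns in this block
```

Summing the bar counts over all blocks of one alignment gives a quick check against the "Matches" figure in its summary line.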



RESULT 3 

US-09-876-252-37 

; Sequence 37, Application US/09876252
; Publication No. US20030018182A1
; GENERAL INFORMATION:
; APPLICANT: Behan, Dominic P.
; APPLICANT: Lehmann-Bruinsma, Karin
; APPLICANT: Chalmers, Derek T.
; APPLICANT: Lowitz, Kevin P.
; APPLICANT: Lin, I-Lin
; APPLICANT: Dang, Huong T.
; APPLICANT: Chen, Ruoping
; APPLICANT: Liaw, Chen W.
; TITLE OF INVENTION: Non-Endogenous Constitutively Activated Human G Protein Coupled Receptors
; FILE REFERENCE: AREN-0054
; CURRENT APPLICATION NUMBER: US/09/876,252
; CURRENT FILING DATE: 2001-06-07
; PRIOR APPLICATION NUMBER: 09/416,760
; PRIOR FILING DATE: 1999-10-12
; PRIOR APPLICATION NUMBER: 09/170,496
; PRIOR FILING DATE: 1998-10-13
; PRIOR APPLICATION NUMBER: 60/110,060
; PRIOR FILING DATE: 1998-11-27
; PRIOR APPLICATION NUMBER: 60/120,416
; PRIOR FILING DATE: 1999-02-16
; PRIOR APPLICATION NUMBER: 60/121,852
; PRIOR FILING DATE: 1999-02-26
; PRIOR APPLICATION NUMBER: 60/109,213
; PRIOR FILING DATE: 1998-11-20
; PRIOR APPLICATION NUMBER: 60/123,944
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,945
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,948
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,951
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,946
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,949
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/152,524
; PRIOR FILING DATE: 1999-09-03
; PRIOR APPLICATION NUMBER: 60/151,114
; PRIOR FILING DATE: 1999-08-27
; PRIOR APPLICATION NUMBER: 60/108,029
; PRIOR FILING DATE: 1998-11-12
; PRIOR APPLICATION NUMBER: 60/136,436
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,439
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,567
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/137,127
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/137,131
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/141,448
; PRIOR FILING DATE: 1999-06-29
; PRIOR APPLICATION NUMBER: 60/136,437
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/156,555
; PRIOR FILING DATE: 1999-09-29
; PRIOR APPLICATION NUMBER: 60/156,634
; PRIOR FILING DATE: 1999-09-29
; PRIOR APPLICATION NUMBER: 60/156,653
; PRIOR FILING DATE: 1999-09-29
; PRIOR APPLICATION NUMBER: 60/157,280
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/157,294
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/157,281
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/157,282
; PRIOR FILING DATE: 1999-10-01
; PRIOR APPLICATION NUMBER: 60/156,633
; PRIOR FILING DATE: 1999-09-29
; NUMBER OF SEQ ID NOS: 146
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 37
; LENGTH: 1005
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-876-252-37 

Query Match 38.4%; Score 592.4; DB 13; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 2.7e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1; 

Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
          | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db      8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
          |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db     68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
           ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db    128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
          ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db    188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
          |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db    248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
          |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  || 
Db    308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
          | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db    368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
          |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || ||| 
Db    428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
          ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |  
Db    488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
          |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db    548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
          |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db    608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
          |||| ||||| || || || || |    | ||||  || |||| || || ||||||||| 
Db    668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
          | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db    728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787

Qy    819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
          |   ||   |    || || |||   | |||||| ||  | ||||   ||||||||||| 
Db    788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
          |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db    848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
          | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db    908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029
          | |||  || |  |     |||| ||||  ||||
Db    968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001



RESULT 4 

US-10-272-983-35 

; Sequence 35, Application US/10272983 

; Publication No. US20030148450A1 

; GENERAL INFORMATION: 

; APPLICANT: Chen, Ruoping 

; APPLICANT: Dang, Huong T. 

; APPLICANT: Liaw, Chen W. 

; APPLICANT: Lin, I-Lin 

; TITLE OF INVENTION: Human Orphan G Protein Coupled Receptors 
; FILE REFERENCE: AREN0050 

; CURRENT APPLICATION NUMBER: US/10/272,983 

; CURRENT FILING DATE: 2002-10-17 

; PRIOR APPLICATION NUMBER: US/09/417,044 

; PRIOR FILING DATE: 1999-10-12 

; PRIOR APPLICATION NUMBER: 60/109,213 

; PRIOR FILING DATE: 1998-11-20 

; PRIOR APPLICATION NUMBER: 60/120,416 
; PRIOR FILING DATE: 1999-02-16 
; PRIOR APPLICATION NUMBER: 60/121,851 
; PRIOR FILING DATE: 1999-02-26 
; PRIOR APPLICATION NUMBER: 60/123,946 
; PRIOR FILING DATE: 1999-03-12 
; PRIOR APPLICATION NUMBER: 60/123,949 
; PRIOR FILING DATE: 1999-03-12 
; PRIOR APPLICATION NUMBER: 60/136,436 
; PRIOR FILING DATE: 1999-05-28 
; PRIOR APPLICATION NUMBER: 60/136,437 
; PRIOR FILING DATE: 1999-05-28 



; PRIOR APPLICATION NUMBER: 60/136,439 
; PRIOR FILING DATE: 1999-05-28 
; PRIOR APPLICATION NUMBER: 60/136,567 
; PRIOR FILING DATE: 1999-05-28 

; Remaining Prior Application data removed - See File Wrapper or PALM. 

; NUMBER OF SEQ ID NOS: 74 

; SOFTWARE: PatentIn Ver. 2.1 

; SEQ ID NO 35 

; LENGTH: 1005 

; TYPE: DNA 
; ORGANISM: Homo sapiens 
US-10-272-983-35 

Query Match 38.4%; Score 592.4; DB 15; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 2.7e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1; 

Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
          | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db      8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
          |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db     68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
           ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db    128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
          ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db    188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
          |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db    248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
          |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  || 
Db    308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
          | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db    368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
          |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || ||| 
Db    428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
          ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |  
Db    488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
          |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db    548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
          |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db    608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
          |||| ||||| || || || || |    | ||||  || |||| || || ||||||||| 
Db    668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
          | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db    728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787

Qy    819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
          |   ||   |    || || |||   | |||||| ||  | ||||   ||||||||||| 
Db    788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
          |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db    848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
          | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db    908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029
          | |||  || |  |     |||| ||||  ||||
Db    968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001



RESULT 5 

US-10-393-807-35 

; Sequence 35, Application US/10393807 

; Publication No. US20030175891A1 

; GENERAL INFORMATION: 

; APPLICANT: Chen, Ruoping 

; APPLICANT: Dang, Huong T. 

; APPLICANT: Liaw, Chen W. 

; APPLICANT: Lin, I-Lin 

; TITLE OF INVENTION: Human Orphan G Protein Coupled Receptors 
; FILE REFERENCE: AREN0050 

; CURRENT APPLICATION NUMBER: US/10/393,807 
; CURRENT FILING DATE: 2003-03-21 
; PRIOR APPLICATION NUMBER: US/09/417,044 
; PRIOR FILING DATE: 1999-10-12 
; PRIOR APPLICATION NUMBER: 60/109,213 
; PRIOR FILING DATE: 1998-11-20 
; PRIOR APPLICATION NUMBER: 60/120,416 
; PRIOR FILING DATE: 1999-02-16 
; PRIOR APPLICATION NUMBER: 60/121,851 
; PRIOR FILING DATE: 1999-02-26 
; PRIOR APPLICATION NUMBER: 60/123,946 
; PRIOR FILING DATE: 1999-03-12 
; PRIOR APPLICATION NUMBER: 60/123,949 
; PRIOR FILING DATE: 1999-03-12 
; PRIOR APPLICATION NUMBER: 60/136,436 
; PRIOR FILING DATE: 1999-05-28 
; PRIOR APPLICATION NUMBER: 60/136,437 



; PRIOR FILING DATE: 1999-05-28 
; PRIOR APPLICATION NUMBER: 60/136,439 
; PRIOR FILING DATE: 1999-05-28 
; PRIOR APPLICATION NUMBER: 60/136,567 
; PRIOR FILING DATE: 1999-05-28 
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 74
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 35
; LENGTH: 1005
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-393-807-35

Query Match 38.4%; Score 592.4; DB 15; Length 1005;

Best Local Similarity 75.5%; Pred. No. 2.7e-139;

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1;

Qy     39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
          | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db      8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67

Qy     99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
          |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db     68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127

Qy    159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
           ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db    128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187

Qy    219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
          ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db    188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247

Qy    279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
          |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db    248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy    339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
          |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  || 
Db    308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy    399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
          | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db    368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427

Qy    459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
          |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || ||| 
Db    428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487

Qy    519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
          ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |  
Db    488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547

Qy    579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
          |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db    548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607

Qy    639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
          |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db    608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667

Qy    699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
          |||| ||||| || || || || |    | ||||  || |||| || || ||||||||| 
Db    668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727

Qy    759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
          | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db    728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787

Qy    819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
          |   ||   |    || || |||   | |||||| ||  | ||||   ||||||||||| 
Db    788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847

Qy    876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
          |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db    848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907

Qy    936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
          | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db    908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967

Qy    996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029
          | |||  || |  |     |||| ||||  ||||
Db    968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001



RESULT 6 

US-10-417-820A-37 

; Sequence 37, Application US/10417820A 

; Publication No. US20030229216A1 

; GENERAL INFORMATION: 

; APPLICANT: Chen, Ruoping 

; APPLICANT: Liaw, Chen W. 

; APPLICANT: Lowitz, Kevin 

; APPLICANT: Chalmers, Derek T. 

; APPLICANT: Behan, Dominic P. 

; TITLE OF INVENTION: Constitutively Activated Human G Protein Coupled 
; TITLE OF INVENTION: Receptors 
; FILE REFERENCE: 7.US28.CON 
; CURRENT APPLICATION NUMBER: US/10/417,820A 
; CURRENT FILING DATE: 2003-04-16 
; PRIOR APPLICATION NUMBER: 09/416,760 
; PRIOR FILING DATE: 1999-10-12 
; PRIOR APPLICATION NUMBER: 09/170,496 
; PRIOR FILING DATE: 1998-10-13 
; PRIOR APPLICATION NUMBER: 60/110,060 
; PRIOR FILING DATE: 1998-11-27 
; PRIOR APPLICATION NUMBER: 60/120,416 
; PRIOR FILING DATE: 1999-02-16 
; PRIOR APPLICATION NUMBER: 60/121,852 
; PRIOR FILING DATE: 1999-02-26 
; PRIOR APPLICATION NUMBER: 60/109,213 
; PRIOR FILING DATE: 1998-11-20 



; PRIOR APPLICATION NUMBER: 60/123,944 
; PRIOR FILING DATE: 1999-03-12 
; PRIOR APPLICATION NUMBER: 60/123,945 
; PRIOR FILING DATE: 1999-03-12 
; PRIOR APPLICATION NUMBER: 60/123,948 
; PRIOR FILING DATE: 1999-03-12 
; PRIOR APPLICATION NUMBER: 60/123,951 
; PRIOR FILING DATE: 1999-03-12 

Remaining Prior Application data removed 
; NUMBER OF SEQ ID NOS : 155 
; SOFTWARE : Patentln version 3.2 
; SEQ ID NO 37 

LENGTH: 1005 
; TYPE: DNA 
; ORGANISM: Homo sapiens 
US-10-417-820A-37 

Query Match 38.4%; Score 592.4; DB 16; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 2.7e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1; 

Qy 39 GCAGAAT GGCACAGAAT TTAT CTT GT GAGAATT GGTT GGCAACAGAGGCTATCTT GAAT A 98 

II I I M M I I I I I I I I I I I I I I I I I I I I M I I I I II I II I I 
Db 8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67 

QY 9 9 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158 

M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I | | | | | | | | || 

Db 68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127 

Qy 159 CT GT GGT GTT C GGC T AC CTCTTCTG CAT GAAGAACTGGAAC AG C AGCAAT GT CT AT C T T T 218 

Ml II I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | Ml | Mill I 
Db 128 T T GT TGTT T AC GGC T AC AT CTTCTCTCT GAAGAACT G GAAC AGC AGTAAT AT T TAT CT C T 187 

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 27 8 

I I I I I I I II II I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I I I I I II I 
Db 18 8 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 2 47 

Qy 279 AT GC CAAT GATAAGGGGAC C TAT G GAGAT GT T CT CT GT AT AAGCAAC C GAT AT GT GCT T C 338 

I M I I I I I I II III I I I I I I I | | | | | | M M | | | | | M | | | | | | || || | | | 
Db 24 8 AT GC CAAT GGAAACT G GAT AT AT GGAGAC GT G CT CT GCATAAG CAAC C GAT AT GT GC TT C 307 

Qy 339 AC AC CAAC CT CT AC AC C AGC AT CCTCTTCCT C AC T T T CATT AG CAT GGAC C GAT AT CT G C 398 

I I I M I I I I II I I I I I I I I I I I II I I I II I I I II I I I I I || | | | | | | | 
Db 308 AT G C CAAC CT CT AT AC C AGC AT TCTCTTTCT C ACT T T TAT C AGCAT AGAT C GAT ACTT GA 367 

Qy 399 T CAT GAAGT AC C CT TT C C GAGAAC ACT T T CTAC AAAAGAAGGAAT T T GC C AT T TT AAT CT 4 58 

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I | I | I I I I I I I I I I I I 
Db 368 T AAT TAAGT AT C CT TT C C GAGAAC AC C T T CT GCAAAAGAAAGAGT TT GCT AT T T TAAT CT 427 

Qy 4 59 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518 

I I I I I I I I I I II I I I I II I M I I I I I I I I M I | I | ! || | | | 
Db 42 8 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGT TACT ACC CAT ACTTCCCCTTATAAATC 4 87 

Qy 519 CT GT C C CAAAAGAAGAGG G C AGT AACT G CAT C GAC TAT G CAAGT T CT G GAAAC C CT GAAC 578 

I I I I II M M I I I I I I I II II I I I I I I I 

Db 488 C T GT T AT AAC T GACAAT GGC AC C AC C T GTAAT GAT T T T GCAAGT T CT GGAGAC C C CAACT 547 



- See File Wrapper or PALM. 



Qy 


579 


ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTT^TTCCTCTCTCTGTGA 
INI 1 1 1 1 1 1 1 II 1 1 1 1 II II I 1 1 1 1 II ! 1 | | M | | | M 1 II 1 1 1 1 II 1 
ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 


638 


Db 


548 


607 


Qy 


639 


TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 

N 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 I | I I | I I I I M 1 1 1 1 1 1 M 1 1 Ml 

T GT GT T T C T T T TAT T ACAAGAT TGCTCTCTTCC T AAAG C AGAG GAAT AGG C AG GT T GCT A 


698 


Db 


608 


667 


Qy 


699 


CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 

N 1 1 1 1 N 1 1 1 1 1 || | | | | || || | 

CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 


758 


Db 


668 


727 


Qy 


759 


TACT CT T CACAC C CTATCATAT CAT GCGCAAT TT GAGGAT CGCCT CAC GC CT GGATAGTT 

1 N N 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 Ml 1 II 1 1 II 1 1 1 II 1 M II 1 1 1 MM 

TGCTTTTTACACCCTATCACGT CAT GCGGAATGT GAGGAT CGCTTCACGCCTGGGGAGTT 


818 


Db 


728 


787 


Qy 819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875

1 N 1 II M III 1 1 1 1 M 1 1 1 1 1 1 II | || || II II 1 1
Db 788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847


Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935

N N 1 II 1 1 II 1 1 1 1 1 M 1 1 1 1 1 1 II 1 1 II 1 1 1 II II 1 M II II M 1 1 1
Db 848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907


Qy 936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995

' II 1 1 i 1 1 1 1 1 1 1 1 1 Mill M I | | || 1 1 II II 1 II II II 1 M 1 II
Db 908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967


Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029

MM Ml 1 1 1 II II 1 1 1 1 1 1
Db 968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001





RESULT 7 

US-10-723-955-37 

; Sequence 37, Application US/10723955
; Publication No. US20040110238A1
; GENERAL INFORMATION:
; APPLICANT: Behan, Dominic P.
; APPLICANT: Chalmers, Derek T.
; APPLICANT: Lin, I-Lin
; APPLICANT: Liaw, Chen W.
; APPLICANT: Lehman-Bruinsma, Karin
; APPLICANT: Lowitz, Kevin P.
; APPLICANT: Dang, Huong T.
; APPLICANT: Chen, Ruoping
; APPLICANT: Gore, Martin
; APPLICANT: White, Carol
; TITLE OF INVENTION: Constitutively Activated Human G Protein Coupled
; TITLE OF INVENTION: Receptors
; FILE REFERENCE: 7.US29.CON
; CURRENT APPLICATION NUMBER: US/10/723,955
; CURRENT FILING DATE: 2003-11-26
; PRIOR APPLICATION NUMBER: 10/417,820
; PRIOR FILING DATE: 2003-4-16
; PRIOR APPLICATION NUMBER: 09/416,760
; PRIOR FILING DATE: 1999-10-12
; PRIOR APPLICATION NUMBER: 09/170,496
; PRIOR FILING DATE: 1998-10-13
; PRIOR APPLICATION NUMBER: 60/110,060
; PRIOR FILING DATE: 1998-11-27
; PRIOR APPLICATION NUMBER: 60/120,416
; PRIOR FILING DATE: 1999-02-16
; PRIOR APPLICATION NUMBER: 60/121,852
; PRIOR FILING DATE: 1999-02-26
; PRIOR APPLICATION NUMBER: 60/109,213
; PRIOR FILING DATE: 1998-11-20
; PRIOR APPLICATION NUMBER: 60/123,944
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,945
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,948
; PRIOR FILING DATE: 1999-03-12
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 148
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 37
; LENGTH: 1005
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-723-955-37 

Query Match 38.4%; Score 592.4; DB 17; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 2.7e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1; 
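
The percentages in the summary lines above can be cross-checked from the printed counts. The sketch below is not part of the search report; it assumes that Best Local Similarity is matches divided by aligned columns (matches + mismatches + indels) and that Query Match is the score divided by the perfect score of 1543 given in the run header. The helper name `percent` is hypothetical.

```python
def percent(numerator: float, denominator: float) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


# Counts printed in the summary lines for this result.
matches, mismatches, indels = 750, 241, 3
score = 592.4
perfect_score = 1543  # assumption: perfect score taken from the run header

# Assumed definition: every aligned column is a match, a mismatch, or an indel.
aligned_columns = matches + mismatches + indels

best_local_similarity = percent(matches, aligned_columns)
query_match = percent(score, perfect_score)

print(f"Best Local Similarity {best_local_similarity}%")
print(f"Query Match {query_match}%")
```

Under the same assumptions, the later results with 764 matches, 246 mismatches, and 4 indels give 764 / 1014, which rounds to the 75.3% they report.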

Qy 39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98

II I I I I M I I I I I I I II I I I I I I I I I I I I M II I I I I I I I I 
Db 8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67 

Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158 

I I I I I I M I! I I II I I I I I I I I I I I II I I I I I I I I I I I I I || || 

Db 68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127 

Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

I I I M I I I I II I I I I I I I I I I I I I I I I I I II I I I I I || I I I I I II I I I 
Db 128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187 

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247 

Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338

I I I i I I M I M Ml I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398

I I I I M M I I I I I I I I II I I I I I I I I I I I I I I II Mill II I I I I I II 
Db 308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy 399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458

I II I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I II I II I I I I I I I I I I II 
Db 368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427



Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518

1 MM 1 1 M 1 II II 1 1 1 II 1 1 1 1 1 1 1 1 I I 1 1 1 1 1 1 IMIM
Db 428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487


Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578

MM M M 1 1 1 1 1 Mill 1 1 II 1 1 1 1 1 1 1 1 II 1 1
Db 488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547


Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638

MM 1 1 M 1 II M 1 1 1 1 II II II 1 | | | 1 1 1 1 II II II 1 | | | 1 | 1
Db 548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607


Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698

M M 1 II II 1 II 1 1 1 1 1 1 II 1 1 II II 1 1 1 1 III | Ml
Db 608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667


Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758

M M 1 1 1 II 1 1 1 1 1 II II Mill 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 II
Db 668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727


Qy 759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818

1 M M 1 II 1 1 1 1 1 1 II 1 1 II 1 1 1 I I || | | | | | | | | || | | | | | | | | | | M 1
Db 728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787


Qy 819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875

1 M 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I | | | || |
Db 788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847


Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935

Mill 1 1 1 II 1 1 II 1 II 1 1 1 1 1 1 1 M II 1 1 1 1 1 M 1 II II |
Db 848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907


Qy 936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995

1 M 1 1 1 1 II 1 1 1 1 1 1 Mill II II 1 1 1 II II II 1 1 1 I I I I I || | | |
Db 908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967


Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029

M M M 1 1 1 II II 1 1 1 1 1 1 1
Db 968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001





RESULT 8 

US-10-782-596-35 

; Sequence 35, Application US/10782596
; Publication No. US20040137509A1
; GENERAL INFORMATION:
; APPLICANT: Chen, Ruoping
; APPLICANT: Dang, Huong T.
; APPLICANT: Liaw, Chen W.
; APPLICANT: Lin, I-Lin
; TITLE OF INVENTION: Human Orphan G Protein Coupled Receptors
; FILE REFERENCE: AREN0050
; CURRENT APPLICATION NUMBER: US/10/782,596
; CURRENT FILING DATE: 2004-02-19
; PRIOR APPLICATION NUMBER: US/09/875,076
; PRIOR FILING DATE: 2001-06-06
; PRIOR APPLICATION NUMBER: 09/417,044
; PRIOR FILING DATE: 1999-10-12
; PRIOR APPLICATION NUMBER: 60/120,416
; PRIOR FILING DATE: 1999-02-16
; PRIOR APPLICATION NUMBER: 60/121,851
; PRIOR FILING DATE: 1999-02-26
; PRIOR APPLICATION NUMBER: 60/123,946
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/123,949
; PRIOR FILING DATE: 1999-03-12
; PRIOR APPLICATION NUMBER: 60/136,436
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,437
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,439
; PRIOR FILING DATE: 1999-05-28
; PRIOR APPLICATION NUMBER: 60/136,567
; PRIOR FILING DATE: 1999-05-28
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 74
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 35
; LENGTH: 1005
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-782-596-35 

Query Match 38.4%; Score 592.4; DB 17; Length 1005; 

Best Local Similarity 75.5%; Pred. No. 2.7e-139; 

Matches 750; Conservative 0; Mismatches 241; Indels 3; Gaps 1;

Qy 39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98

II I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I | | | | 
Db 8 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 67 

Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158

I I I M I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II 

Db 68 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 127

Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

I I I Mill MINI I I I I I I I I I I I I I I I I I I I M I | IMM | 

Db 128 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 187

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278 

I M I I II II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I | I I | | | | | | | | | | 
Db 188 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 247 

Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338 

MINIMI II III I I I I I I I I II N II I I II I I N II II II N I I II I I I 

Db 248 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 307

Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398

I N N N I I I I I I I || I I I I | I M I II I I I II II I I | | | || | | | | | | | 
Db 308 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 367

Qy 399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458

I N I I I II I II I I I Ill M I I I I I I 

Db 368 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 427



Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518

1 MM 1 1 1 1 1 II 1 1 1 | | | | | | M 1 1 1 II 1 1 1 1 | I I | | | | ||
Db 428 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 487


Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578

MM II II 1 II 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 I
Db 488 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 547


Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638

MM 1 II 1 M 1 M 1 1 1 1 II II M I I I I I I | | | | | | | | | | | M | | | | | | |
Db 548 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 607


Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698

M M 1 1 1 II 1 1 1 1 1 1 1 1 1 1 | 1 1 I 1 1 1 1 I M Mill MM II 1
Db 608 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 667


Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758

M M 1 1 1 1 1 1 1 1 1 M M | MMI Ml M I I I M M
Db 668 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 727


Qy 759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818

1 M M II 1 1 || 1 II 1 II 1 1 II 1 M 1 II 1 M II 1 1 II 1 MM
Db 728 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 787


Qy 819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875

1 M 1 1 1 II M 1 1 1 1 1 II 1 I I Mill II II 1 II 1 1 1 1
Db 788 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 847


Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935

M M M 1 1 1 II 1 1 M II II 1 II 1 II 1 1 1 1 1 1 II 1 II 1 1 M 1 II 1 II 1 1 1
Db 848 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 907


Qy 936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995

II 1 M 1 1 II 1 II II Ml M 1 M II 1 1 M 1 1 II 1 II II M 1 1
Db 908 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 967


Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA 1029

MM M 1 1 1 II II II 1 MM
Db 968 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAA 1001





RESULT 9 

US-10-225-567A-566 

; Sequence 566, Application US/10225567A
; Publication No. US20030113798A1
; GENERAL INFORMATION:
; APPLICANT: Lifespan Biosciences
; APPLICANT: Brown, Joseph P.
; APPLICANT: Burmer, Glenna C.
; APPLICANT: Roush, Christine L.
; TITLE OF INVENTION: ANTIGENIC PEPTIDES AND ANTIBODIES FOR G PROTEIN-COUPLED RECEPTORS (GPCRS)
; FILE REFERENCE: 1920-4-4
; CURRENT APPLICATION NUMBER: US/10/225,567A
; CURRENT FILING DATE: 2001-12-19
; PRIOR APPLICATION NUMBER: 60/257,144
; PRIOR FILING DATE: 2000-12-19
; NUMBER OF SEQ ID NOS: 2292
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 566
; LENGTH: 1380
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-225-567A-566 

Query Match 38.4%; Score 592.4; DB 15; Length 1380; 

Best Local Similarity 75.3%; Pred. No. 3.3e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2; 

Qy 39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98

M I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 50 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 109

Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158

M I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | || 

Db 110 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 169

Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 

Ml M I I I M M I I I I I I I I I I I I I I I I I I I II II II I Ml I I I I I I I 
Db 170 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 229

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M | | | | | I I | | M I I I I I I 
Db 230 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 289

Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338

I I M I I I I I M Ml I I II I I I I II I I I I I I I I I I I I I I I I I 1 I I I I I I I M 
Db 290 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 349

Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398

I I I I I I I I II I I I I M I I I I II I I I I II I I I I II II I I I II M I I I II 
Db 350 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 409

Qy 399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458

I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II || I I I I I I I I I I | I | | | 
Db 410 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 469

Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518

I I M I Mill I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 470 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 529

Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 530 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 589

Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638 

I J I I I I I I I I I I M I I I II II II I I I I I I I Mil! I I I I I I M I I I I I I 
Db 590 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 649

Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698 

I M I I I I I I M II II II II I II II I II II I Mill I I II III 

Db 650 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 709



Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758

I I I I INN II II II || | | I I I I I I I I I I I I II I I I I M | I I 

Db 710 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 769 

Qy 759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818

I II II I I I I I I I I I I I I I I I I I I III I I I I I I I II I II I I I I I I I | | | | | 
Db 770 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 829

Qy 819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875

I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 830 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 889

Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935

I I I I I M I I I I I I I I I I I I M I II II I I I I I I I II II I M I I I I II I II 
Db 890 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 949

Qy 936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995

I I I I I I I I I I I I I I I I II I I I II I I I I I I II I I I I I I | | | | | | | | | 
Db 950 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1009

Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048

I I I I Ml I I I I I I I I I I I I I II I I I I II I I I I I I 

Db 1010 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1063



RESULT 10 
US-09-764-886-36 

; Sequence 36, Application US/09764886
; Publication No. US20030139327A9
; GENERAL INFORMATION:
; APPLICANT: Rosen et al.
; TITLE OF INVENTION: Nucleic Acids, Proteins, and Antibodies
; FILE REFERENCE: PTZ02
; CURRENT APPLICATION NUMBER: US/09/764,886
; CURRENT FILING DATE: 2001-01-17
; Prior application data removed - consult PALM or file wrapper
; NUMBER OF SEQ ID NOS: 88
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 36
; LENGTH: 1436
; TYPE: DNA
; ORGANISM: Homo sapiens

US-09-764-886-36 

Query Match 38.4%; Score 592.4; DB 10; Length 1436; 

Best Local Similarity 75.3%; Pred. No. 3.3e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2; 

Qy 39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98

II INI I I I I I I I I I I I I I I I II I I II I I I I I I I I I 

Db 100 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 159

Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158

M I I I I I I M I I I I I | I M I I I I I I I I I I I I II I I I I I I I I I II 

Db 160 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 219 



Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218 



Db 220 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 279

Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278

M I I I I I II I I I I I I I I I I I I I I I I I I I I I M I I I I I II I I I I I I I I Mill 
Db 280 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 339

Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338

I I I I M Ml I I I M I I I I I I I I I I I I I I I | | | | | | | | || | | | M | | 

Db 340 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 399

Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398

I I I I I M I I I I I II I I I I I Mill II I II I II II IMM II I I I I I II 
Db 400 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 459

Qy 399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458

I M I I I I I I I I M M I I II II I I I II I II II I M I II Mill I I I II M II I 
Db 460 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 519

Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518

I I I I I Mill I M I I I I II I I I I I I I II II II M I I M I I I 
Db 520 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 579

Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578

MM II M I I I I I I I I I I II I || I I I II II I M I M I I I 
Db 580 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 639

Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638

I I I I I I I I M II I I II I || || M I I II I I I II II I I I I I I I M I II II I 
Db 640 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 699

Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698

I I I I I I M I II I I I I I I M I II I I I II I I | | || | | | | 

Db 700 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 759

Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758

I M I HI I I I M II I I II II II I I I II II M 

Db 760 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 819

Qy 759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818

I M I I II I I I I I I I I I I M I I I I I I I II II I II II I II II II II II II II 
Db 820 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 879

Qy 819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875

I II I I I I I I I I I II I I I I II Mill I II II II II M 

Db 880 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 939

Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935

I I M I I I I II || | | || | || I | || || | || M I II I I II M II I 

Db 940 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 999

Qy 936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
M II I II II II I I I II II I | I II II I II I II II II I II I I II II I I 
Db 1000 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1059

Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048

I III Ml I I I I II I I I I I I II I I I I I I I MIMI 



Db 1060 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1113



RESULT 11 
US-09-764-886-36 

; Sequence 36, Application US/09764886
; Publication No. US20020086822A1
; GENERAL INFORMATION:
; APPLICANT: Rosen et al.
; TITLE OF INVENTION: Nucleic Acids, Proteins, and Antibodies
; FILE REFERENCE: PTZ02
; CURRENT APPLICATION NUMBER: US/09/764,886
; CURRENT FILING DATE: 2001-01-17
; Prior application data removed - consult PALM or file wrapper
; NUMBER OF SEQ ID NOS: 88
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 36
; LENGTH: 1436
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-764-886-36 



Query Match 38.4%; Score 592.4; DB 13; Length 1436; 

Best Local Similarity 75.3%; Pred. No. 3.3e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2; 



Qy 39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98

ii iiiim 1 1 1 1 i 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 ii m i i 1 1 i i
Db 100 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 159


Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I M 1 1 1 1 1 1 1 1 1 I M 1 1 1 1 II
Db 160 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 219


Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218

IN II 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1! 1 1 1 1 1 1 1 1 1 1 1 II M |
Db 220 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 279


Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278

1 1 1 1 1 1 1 ii Milium Mill 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 I 1 II 1 II ||
Db 280 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 339


Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338

MINIMI II 1 II II 1 II 1 II || | | | || | | | || || | || | | | | || || | | | |
Db 340 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 399


Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398

1 1 N II 1 1 M 1 II 1 1 1 II 1 Mill II II II 1 1 II 1 1 1 M || III. 1 ) ||
Db 400 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 459


Qy 399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458

1 M Mill 1 II II 1 II II II 1 II II II II 1 II II 1 II 1 1 M I I II 1 1 M 1 1 1
Db 460 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 519


Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518

1 1 1 1 1 1 1 1 1 1 N 1 II 1 M 1 II M 1 1 II II II 1 II 1 1 II II 1
Db 520 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 579



Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578

MM II II Mill I | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 580 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 639


Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638

1 1 1 1 1 1 1 1 1 1 M 1 II 1 1 II II || | | | | M 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I
Db 640 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 699


Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698

N 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 I | | |
Db 700 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 759


Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758

I'll Mill II 1 | M 1 1 1 1 II 1 1 1 1 1
Db 760 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 819


Qy 759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | | | | || 1 1 1 1 1 1 1 M M 1 | 1 | | MM
Db 820 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 879


Qy 819 GGCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875

1 II 1 1 1 1 1 1 1 1 1 1 1 M II II 1 1 1 1 1 1 1 1 II II 1 II |
Db 880 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 939


Qy 876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935

1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 II 1 I 1 1 1 II 1 I I | | | |
Db 940 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 999


Qy 936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995

1 1 1 1 1 1 1 1 1 II 1 1 1 1 II 1 1 1 I II II 1 1 1 1 1 1 1 1 II || M 1 1 1 1 1 1 1
Db 1000 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1059


Qy 996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048

1 1 1 1 IM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II II 1 1 1 1
Db 1060 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1113





RESULT 12 

US-10-264-237-1352 

; Sequence 1352, Application US/10264237
; Publication No. US20040009491A1
; GENERAL INFORMATION:
; APPLICANT: Birse et al.
; TITLE OF INVENTION: Nucleic Acids, Proteins, and Antibodies
; FILE REFERENCE: PA131P1
; CURRENT APPLICATION NUMBER: US/10/264,237
; CURRENT FILING DATE: 2002-10-04
; PRIOR APPLICATION NUMBER: PCT/US01/16450
; PRIOR FILING DATE: 2001-05-18
; PRIOR APPLICATION NUMBER: US 60/205,515
; PRIOR FILING DATE: 2000-05-19
; NUMBER OF SEQ ID NOS: 2876
; SOFTWARE: PatentIn Ver. 3.1
; SEQ ID NO 1352
; LENGTH: 1436
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-264-237-1352 



Query Match 38.4%; Score 592.4; DB 16; Length 1436; 

Best Local Similarity 75.3%; Pred. No. 3.3e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;



Qy 39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98

M 1 M 1 1 1 1 1 1 1 Mill 1 1 1 1 1 1 1 II 1 1 1 1 1 I I M 1 Mill
Db 100 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 159


Qy 99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158

MINIMI! II 1 1 1 1 1 1 1 1 II 1 1 1 1 II M 1 1 II 1 | | | || 1 1 II
Db 160 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 219


Qy 159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218

Ml 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 II 1 | | | | M | | | | | | | |M 1 Mill 1
Db 220 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 279


Qy 219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278

NIMH II II 1 1 1 1 1 II 1 II 1 II II 1 1 1 1 M 1 M Mill 1 1 M II 1 Mill
Db 280 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 339


Qy 279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338

N M II II 1 II III 1 II M 1 II 1 M 1 1 1 II II 1 1 M M II 1 II II 1 1 II II
Db 340 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 399


Qy 339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398

1 1 1 M 1 1 M M II 1 1 II II II II II 1 1 1 1 1 II II | | | || | || | || | ||
Db 400 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 459


Qy 399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458

1 II 1 1 1 M 1 1 1 M 1 M 1 1 1 M M II 1 1 1 II II 1 II II II 1 II II II II II II
Db 460 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 519


Qy 459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518

1 1 1 1 1 M 1 1 1 II 1 II M M 1 II II 1 IMIM
Db 520 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 579


Qy 519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578

N 1 1 1 II Mill 1 II 1 M 1 II 1 II 1 II II M 1 1
Db 580 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 639


Qy 579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638

1 II 1 II 1 M 1 1 II 1 II 1 II II M II 1 1 1 1 1 1 1 1 1 1 1 1 !! 1 1 1 1 1 1 1 1 t 1
Db 640 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 699


Qy 639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698

1 1 1 1 1 1 1 1 1 1 1 1 1 1 M II M .1 I II II 1 1 II 1 II 1 1 || | | |M
Db 700 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 759


Qy 699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MM | M II II II II II 1 I
Db 760 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 819


Qy 


759 


T AC T CT T CAC AC C C TAT CAT AT CAT G C GCAAT T T GAG GAT C GC C T C AC GC C T GGAT AGT T 

1 II H Ml IMIM IMIM MM 


818 



Db 



82 0 T G CT T T T T AC AC C CT AT CAC GT CAT G C G GAAT GT GAG GAT C G CT T C AC G C CT G GGGAGT T 87 9 



Qy        819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
              |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db        880 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 939

Qy        876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
              |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db        940 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 999

Qy        936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
              | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db       1000 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1059

Qy        996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
              | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db       1060 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1113



RESULT 13
US-10-311-671-20
; Sequence 20, Application US/10311671
; Publication No. US20040072996A1
; GENERAL INFORMATION:
;  APPLICANT: INCYTE GENOMICS, INC.
;  APPLICANT: LAL, Preeti G.
;  APPLICANT: BAUGHN, Mariah R.
;  APPLICANT: HAFALIA, April J. A.
;  APPLICANT: NGUYEN, Danniel B.
;  APPLICANT: GANDHI, Ameena R.
;  APPLICANT: KALLICK, Deborah A.
;  APPLICANT: GRIFFIN, Jennifer A.
;  APPLICANT: YUE, Henry
;  APPLICANT: KHAN, Farrah A.
;  APPLICANT: ARVIZU, Chandra S.
;  APPLICANT: LU, Dyung Aina M.
;  APPLICANT: TRIBOULEY, Catherine M.
;  APPLICANT: LU, Yan
;  APPLICANT: CHAWLA, Narinder K.
;  APPLICANT: GRAUL, Richard
;  APPLICANT: YAO, Monique G.
;  APPLICANT: YANG, Junming
;  APPLICANT: RAMKUMAR, Jayalaxmi
;  APPLICANT: AU-YOUNG, Janice K.
;  APPLICANT: ELLIOTT, Vicki S.
;  APPLICANT: HERNANDEZ, Roberto
;  APPLICANT: WALSH, Roderick T.
;  APPLICANT: BOROWSKY, Mark L.
;  APPLICANT: THORNTON, Michael B.
;  APPLICANT: HE, Ann
;  TITLE OF INVENTION: G-PROTEIN COUPLED RECEPTORS
;  FILE REFERENCE: PI 0131 USN
;  CURRENT APPLICATION NUMBER: US/10/311,671
;  CURRENT FILING DATE: 2002-12-16
;  PRIOR APPLICATION NUMBER: PCT/US01/19275
;  PRIOR FILING DATE: 2001-06-15
;  PRIOR APPLICATION NUMBER: 60/212,483
;  PRIOR FILING DATE: 2000-06-16
;  PRIOR APPLICATION NUMBER: 60/213,954
;  PRIOR FILING DATE: 2000-06-22
;  PRIOR APPLICATION NUMBER: 60/215,209
;  PRIOR FILING DATE: 2000-06-29
;  PRIOR APPLICATION NUMBER: 60/216,595
;  PRIOR FILING DATE: 2000-07-07
;  PRIOR APPLICATION NUMBER: 60/218,936
;  PRIOR FILING DATE: 2000-07-14
;  PRIOR APPLICATION NUMBER: 60/219,154
;  PRIOR FILING DATE: 2000-07-19
;  PRIOR APPLICATION NUMBER: 60/220,141
;  PRIOR FILING DATE: 2000-07-21
;  NUMBER OF SEQ ID NOS: 35
;  SOFTWARE: PERL Program
; SEQ ID NO 20
;   LENGTH: 1542
;   TYPE: DNA
;   ORGANISM: Homo sapiens
;   FEATURE:
;   NAME/KEY: misc_feature
;   OTHER INFORMATION: Incyte ID No: 3485895CB1
US-10-311-671-20

Query Match 38.4%; Score 592.4; DB 12; Length 1542;
Best Local Similarity 75.3%; Pred. No. 3.5e-139;
Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;

Qy         39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
              | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db        205 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 264

Qy         99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
              |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db        265 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 324

Qy        159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
               ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db        325 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 384

Qy        219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
              ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db        385 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 444

Qy        279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
              |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db        445 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 504

Qy        339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
              |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db        505 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 564

Qy        399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
              | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db        565 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 624



Qy        459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
              |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db        625 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 684

Qy        519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
              ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db        685 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 744

Qy        579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
              |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db        745 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 804


Qy        639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
              |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db        805 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 864

Qy        699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
              |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db        865 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 924

Qy        759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
              | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db        925 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 984


Qy        819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
              |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db        985 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 1044

Qy        876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
              |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db       1045 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1104

Qy        936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
              | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db       1105 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1164

Qy        996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
              | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db       1165 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1218




RESULT 14
US-09-764-886-11
; Sequence 11, Application US/09764886
; Publication No. US20030139327A9
; GENERAL INFORMATION:
;  APPLICANT: Rosen et al.
;  TITLE OF INVENTION: Nucleic Acids, Proteins, and Antibodies
;  FILE REFERENCE: PTZ02
;  CURRENT APPLICATION NUMBER: US/09/764,886
;  CURRENT FILING DATE: 2001-01-17
;  Prior application data removed - consult PALM or file wrapper
;  NUMBER OF SEQ ID NOS: 88
;  SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 11
;   LENGTH: 4232
;   TYPE: DNA
;   ORGANISM: Homo sapiens
US-09-764-886-11

Query Match 38.4%; Score 592.4; DB 10; Length 4232;
Best Local Similarity 75.3%; Pred. No. 6.6e-139;
Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;



Qy         39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
              | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db        110 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 169

Qy         99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
              |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db        170 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 229

Qy        159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
               ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db        230 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 289

Qy        219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
              ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db        290 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 349


Qy        279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
              |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db        350 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 409

Qy        339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
              |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db        410 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 469

Qy        399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
              | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db        470 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 529

Qy        459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
              |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db        530 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 589


Qy        519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
              ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db        590 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 649

Qy        579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
              |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db        650 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 709

Qy        639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
              |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db        710 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 769

Qy        699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
              |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db        770 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 829

Qy        759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
              | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db        830 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 889



Qy        819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
              |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db        890 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 949

Qy        876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
              |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db        950 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1009

Qy        936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
              | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db       1010 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1069

Qy        996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
              | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db       1070 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1123



RESULT 15 
US-09-764-886-11 

Sequence 11, Application US/09764886 
Publication No. US20020086822A1 
GENERAL INFORMATION: 
APPLICANT: Rosen et al.

TITLE OF INVENTION: Nucleic Acids, Proteins, and Antibodies 
FILE REFERENCE: PTZ02 

CURRENT APPLICATION NUMBER: US/09/764,886
CURRENT FILING DATE: 2001-01-17 

Prior application data removed - consult PALM or file wrapper 
NUMBER OF SEQ ID NOS: 88
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 11 
LENGTH: 4232 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-09-764-886-11 

Query Match 38.4%; Score 592.4; DB 13; Length 4232; 

Best Local Similarity 75.3%; Pred. No. 6.6e-139; 

Matches 764; Conservative 0; Mismatches 246; Indels 4; Gaps 2;

Qy         39 GCAGAATGGCACAGAATTTATCTTGTGAGAATTGGTTGGCAACAGAGGCTATCTTGAATA 98
              | |  ||||||  ||||  | ||||  | || ||| ||||| ||||||||  | || | |
Db        110 GGATCATGGCATGGAATGCAACTTGCAAAAACTGGCTGGCAGCAGAGGCTGCCCTGGAAA 169

Qy         99 AGTACTACCTCTCTGCATTTTATGCAATCGAGTTCATTTTTGGACTGCTTGGGAATGTCA 158
              |||||||||| ||    |||||||  || |||||| || | ||| | ||||| |||  ||
Db        170 AGTACTACCTTTCCATTTTTTATGGGATTGAGTTCGTTGTGGGAGTCCTTGGAAATACCA 229

Qy        159 CTGTGGTGTTCGGCTACCTCTTCTGCATGAAGAACTGGAACAGCAGCAATGTCTATCTTT 218
               ||| || | ||||||| ||||||   ||||||||||||||||||| ||| | ||||| |
Db        230 TTGTTGTTTACGGCTACATCTTCTCTCTGAAGAACTGGAACAGCAGTAATATTTATCTCT 289

Qy        219 TTAACCTTTCCATCTCTGACTTTGCTTTCCTGTGCACCCTTCCCATCCTGATAAAGAGTT 278
              ||||||| ||  |||||||||| ||||| ||||||||||| ||||| ||||||| |||||
Db        290 TTAACCTCTCTGTCTCTGACTTAGCTTTTCTGTGCACCCTCCCCATGCTGATAAGGAGTT 349



Qy        279 ATGCCAATGATAAGGGGACCTATGGAGATGTTCTCTGTATAAGCAACCGATATGTGCTTC 338
              |||||||||  ||  |||  |||||||| || ||||| ||||||||||||||||||||||
Db        350 ATGCCAATGGAAACTGGATATATGGAGACGTGCTCTGCATAAGCAACCGATATGTGCTTC 409

Qy        339 ACACCAACCTCTACACCAGCATCCTCTTCCTCACTTTCATTAGCATGGACCGATATCTGC 398
              |  |||||||||| |||||||| ||||| |||||||| || ||||| || |||||  ||
Db        410 ATGCCAACCTCTATACCAGCATTCTCTTTCTCACTTTTATCAGCATAGATCGATACTTGA 469

Qy        399 TCATGAAGTACCCTTTCCGAGAACACTTTCTACAAAAGAAGGAATTTGCCATTTTAATCT 458
              | || ||||| ||||||||||||||| |||| |||||||| || ||||| ||||||||||
Db        470 TAATTAAGTATCCTTTCCGAGAACACCTTCTGCAAAAGAAAGAGTTTGCTATTTTAATCT 529

Qy        459 CGCTGGCTGTCTGGGCCTTAGTGACCTTAGAAGTTCTACCCATGCTCACTTTCATCAATT 518
              |  ||||  | ||||  ||||| ||||||||  | |||||||| ||  |  | || |||
Db        530 CCTTGGCCATTTGGGTTTTAGTAACCTTAGAGTTACTACCCATACTTCCCCTTATAAATC 589

Qy        519 CTGTCCCAAAAGAAGAGGGCAGTAACTGCATCGACTATGCAAGTTCTGGAAACCCTGAAC 578
              ||||   ||  ||  | ||||  | ||| |  || | ||||||||||||| ||||  |
Db        590 CTGTTATAACTGACAATGGCACCACCTGTAATGATTTTGCAAGTTCTGGAGACCCCAACT 649

Qy        579 ACAATCTCATTTACAGCCTCTGCCTGACTTTGTTGGGCTTCCTAATTCCTCTCTCTGTGA 638
              |||| |||||||||||| | || || ||  ||||||| ||||| |||||||| | |||||
Db        650 ACAACCTCATTTACAGCATGTGTCTAACACTGTTGGGGTTCCTTATTCCTCTTTTTGTGA 709

Qy        639 TGTGCTTCTTCTACTACAAGATGGTAGTCTTCTTAAAGAGGAGGAGCCAGCAGCAAGCAA 698
              |||| ||||| || |||||||| |   ||||| |||||  |||||    ||||   || |
Db        710 TGTGTTTCTTTTATTACAAGATTGCTCTCTTCCTAAAGCAGAGGAATAGGCAGGTTGCTA 769

Qy        699 CTGCCCTGCCACTGGACAAACCCCAACGCCTGGTGGTCCTGGCGGTTGTGATCTTCTCTA 758
              |||| ||||| || || || || |    | ||||  || |||| || || |||||||||
Db        770 CTGCTCTGCCCCTTGAAAAGCCTCTCAACTTGGTCATCATGGCAGTGGTAATCTTCTCTG 829

Qy        759 TACTCTTCACACCCTATCATATCATGCGCAATTTGAGGATCGCCTCACGCCTGGATAGTT 818
              | || || |||||||||||  ||||||| ||| |||||||||| ||||||||||  ||||
Db        830 TGCTTTTTACACCCTATCACGTCATGCGGAATGTGAGGATCGCTTCACGCCTGGGGAGTT 889

Qy        819 G---GCCACAAGGATGTACACAGAAGGCCATCAAATCTATATACACACTGACACGGCCTC 875
              |   ||   |    || || |||   | |||||| ||  | ||||   |||||||||||
Db        890 GGAAGCAGTATCAGTGCACTCAGGTCGTCATCAACTCCTTTTACATTGTGACACGGCCTT 949

Qy        876 TGGCCTTTCTGAACAGTGCCATCAATCCCATCTTCTACTTCCTCATGGGAGACCATTACA 935
              |||||||||||||||||| |||||| ||  ||||||| || ||  ||||||| || | ||
Db        950 TGGCCTTTCTGAACAGTGTCATCAACCCTGTCTTCTATTTTCTTTTGGGAGATCACTTCA 1009

Qy        936 GAGAGATGCTGATTAGTAAGTTCAGACAATACTTCAAGTCCCTTACATCCTTCAGGACAT 995
              | || |||||||| | | |  | |||||  ||||||| |||||||||||||| || | ||
Db       1010 GGGACATGCTGATGAATCAACTGAGACACAACTTCAAATCCCTTACATCCTTTAGCAGAT 1069

Qy        996 GAGCTGCTGGATGCAGGTCTTCACTCAGCCAAAA-TGAGACACTTGATAAACAG 1048
              | |||  || |  |     |||| ||||  |||| ||||   ||||  ||||||
Db       1070 GGGCTCATGAACTCCTACTTTCATTCAGAGAAAAGTGAGGGGCTTGTGAAACAG 1123

Search completed: August 24, 2004, 18:00:11 
Job time : 753 secs



